Homework 3¶

ECS271¶

Kay Royo¶

You will learn how to train a deep network using PyTorch tool. Please read the following tutorial. You may skip the data parallelism section. https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

CIFAR10 is a dataset of 60, 000 color images of size 32 × 32 from 10 categories. Please download the PyTorch tutorial code for CIFAR10 to start: https://pytorch.org/tutorials/_downloads/cifar10_tutorial.py

When you run the tutorial code, it will download CIFAR10 dataset for you. Please follow the instructions in the following link to install PyTorch: https://pytorch.org/

To learn more, you can also find a tutorial for MNIST here:

https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py

and the sample model for MNIST here:

https://github.com/pytorch/examples/blob/master/mnist/main.py

and the sample code for Imagenet here:

https://github.com/pytorch/examples/blob/master/imagenet/main.py

For all the following sections, train the model for 50 epochs and plot the curve for loss, training accuracy, and test accuracy evaluated every epoch.

In [ ]:
# import dependencies 

import pandas as pd
import plotly
import plotly.express as px
import plotly.graph_objects as go
import matplotlib.pyplot as plt
import numpy as np
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import time
import torch.optim as optim
import plotly.io as pio
pio.renderers.default='notebook'
from IPython.display import Image

(1)¶

Run the tutorial code out of the box and make sure you get reasonable results. You will report these results in Section 4, so no report needed here.

Notes:

Architecture:

  • Input: 32x32-pixel images with 3 channels (RGB) → 3x32x32 images
  • Convolutions with 3 input channels, 6 output channels, and 5x5 square convolution → 6x28x28 images
  • 2x2 max pooling (subsampling) → 6x14x14 images
  • 6 input channels (from the previous Conv2d layer), 16 output channels, 5x5 square convolutions → 16x10x10 images
  • 2x2 max pooling (subsampling) → 16x5x5 images
  • Fully connected linear (=dense) layer with 16x5x5=400 input size and 120 output; ReLU activation
  • Fully connected layer with 120 input and 84 output; ReLU activation
  • Fully connected output layer with 84 input and 10 output (for the 10 classes in the CIFAR10 dataset); no/linear activation

Note that the layers are defined in the constructor and the activations applied in the forward function.

Formula to calculate the output size of a convolutional layer:

$\frac{𝑊−𝐾+2𝑃}{𝑆}+1$ with input size 𝑊 (width and height for square images), convolution size 𝐾 , padding 𝑃 (default 0), and stride 𝑆 (default 1).

In [ ]:
# Run tutorial original code
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))

# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
Files already downloaded and verified
Files already downloaded and verified
  car  deer  frog truck
2022-12-01T19:07:35.724660 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.softmax(self.fc3(x), dim=1)
        return x
        
net = Net()
print(net)
Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
In [ ]:
total = 0
print('Trainable parameters:')
for name, param in net.named_parameters():
    if param.requires_grad:
        print(name, '\t', param.numel())
        total += param.numel()
print()
print('Total', '\t', total)
Trainable parameters:
conv1.weight 	 450
conv1.bias 	 6
conv2.weight 	 2400
conv2.bias 	 16
fc1.weight 	 48000
fc1.bias 	 120
fc2.weight 	 10080
fc2.bias 	 84
fc3.weight 	 840
fc3.bias 	 10

Total 	 62006
In [ ]:
import time
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
In [ ]:
%%time

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
[1,  2000] loss: 2.069
[1,  4000] loss: 2.048
[1,  6000] loss: 2.042
[1,  8000] loss: 2.035
[1, 10000] loss: 2.029
[1, 12000] loss: 2.026
[2,  2000] loss: 2.009
[2,  4000] loss: 2.009
[2,  6000] loss: 2.000
[2,  8000] loss: 2.004
[2, 10000] loss: 1.994
[2, 12000] loss: 1.994
Finished Training
CPU times: total: 2min 18s
Wall time: 1min 17s
In [ ]:
dataiter = iter(testloader)
images, labels = dataiter.next()

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))


outputs = net(images)

_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
GroundTruth:    cat  ship  ship plane
Predicted:   frog  ship   car  ship
Accuracy of the network on the 10000 test images: 45 %
Accuracy of plane : 45 %
Accuracy of   car : 84 %
Accuracy of  bird : 30 %
Accuracy of   cat : 23 %
Accuracy of  deer : 14 %
Accuracy of   dog : 62 %
Accuracy of  frog : 55 %
Accuracy of horse : 50 %
Accuracy of  ship : 50 %
Accuracy of truck : 33 %
cpu
2022-11-27T16:19:15.307463 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Assume that we are on a CUDA machine, then this should print a CUDA device:

print(device)
cpu

(2)¶

Change the code to have only a single fully connected layer. The model will have a single layer that connects the input to the output. What is the number of parameters? In PyTorch, ”nn.Linear” can be used for fully connected layer.

Answer:

The number of learnable parameters are

Layer Activation shape Activation size # Parameters
Input layer (32,32,3) 32 x 32 x 3 = 3072 0
FC1 (3072,10) 3072 x 10 = 30720 30730

The number of parameters in input layer is 0 since the it has nothing to learn and only provides the shape of the input image so there are 0 learnable parameters.

The number of parameters in fully connected layer 1 (FC1) is computed as (current layer c = 10) x (previous layer p = 3072) + (1 x c=10) where the second term (1 x c) added represents the bias term.

In [ ]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))

dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
Files already downloaded and verified
Files already downloaded and verified
 bird   car   dog   dog
2022-12-01T20:16:59.321337 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
class FullNet1(nn.Module):
    def __init__(self):
        super(FullNet1, self).__init__()
        self.fc1 = nn.Linear(32*32*3, 10) # nn.Linear(in_features, out_features)

    def forward(self, x):
        x = x.view(-1, 32*32*3)
        x = self.fc1(x)
        #x = F.softmax(x, dim=1)
        return x


FullNet1 = FullNet1().to(device)
print(FullNet1)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(FullNet1.parameters(), lr=0.001, momentum=0.9)
FullNet1(
  (fc1): Linear(in_features=3072, out_features=10, bias=True)
)
In [ ]:
# print learnable parameters 
total = 0
print('Trainable parameters:')
for name, param in FullNet1.named_parameters():
    if param.requires_grad:
        print(name, '\t', param.numel())
        total += param.numel()
print()
print('Total', '\t', total)
Trainable parameters:
fc1.weight 	 30720
fc1.bias 	 10

Total 	 30730
In [ ]:
for name, p in FullNet1.named_parameters():
    print(name, ',', p.size(), type(p))
fc1.weight , torch.Size([10, 3072]) <class 'torch.nn.parameter.Parameter'>
fc1.bias , torch.Size([10]) <class 'torch.nn.parameter.Parameter'>
In [ ]:
#print dimension of forward output 
x = FullNet1.forward(images)
print(x.shape)
torch.Size([4, 10])
In [ ]:
#function for accuracy 
def accuracy(pred, labels):
  _, predicted = torch.max(pred, dim=1)
  correct_pred = torch.sum(predicted==labels).item()
  total_pred = len(predicted)
  accuracy = 100*(correct_pred/total_pred)
  return accuracy
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 
for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = FullNet1(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = FullNet1(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 2.1543350004715407 | train accuracy: 32.148 | test accuracy: 33.79 
*** Epoch 1 ***
loss: 2.1054822382172094 | train accuracy: 34.362 | test accuracy: 31.71 
*** Epoch 2 ***
loss: 2.086504551289878 | train accuracy: 34.882 | test accuracy: 32.63 
*** Epoch 3 ***
loss: 2.0733050030261024 | train accuracy: 35.386 | test accuracy: 34.09 
*** Epoch 4 ***
loss: 2.0614900813847887 | train accuracy: 35.818 | test accuracy: 33.76 
*** Epoch 5 ***
loss: 2.051229595792266 | train accuracy: 36.114 | test accuracy: 33.08 
*** Epoch 6 ***
loss: 2.0450370221524174 | train accuracy: 36.362 | test accuracy: 33.63 
*** Epoch 7 ***
loss: 2.042748621767097 | train accuracy: 36.446 | test accuracy: 32.23 
*** Epoch 8 ***
loss: 2.034762818364203 | train accuracy: 36.856 | test accuracy: 33.03 
*** Epoch 9 ***
loss: 2.0292925003569398 | train accuracy: 36.824 | test accuracy: 32.4 
*** Epoch 10 ***
loss: 2.0287099420471146 | train accuracy: 36.854 | test accuracy: 31.66 
*** Epoch 11 ***
loss: 2.01686352734466 | train accuracy: 37.068 | test accuracy: 33.07 
*** Epoch 12 ***
loss: 2.026159887845046 | train accuracy: 37.144 | test accuracy: 33.02 
*** Epoch 13 ***
loss: 2.013506087477729 | train accuracy: 37.314 | test accuracy: 32.61 
*** Epoch 14 ***
loss: 2.0154630682896935 | train accuracy: 37.16 | test accuracy: 33.66 
*** Epoch 15 ***
loss: 2.0084819334678885 | train accuracy: 37.52 | test accuracy: 34.45 
*** Epoch 16 ***
loss: 2.009313545784029 | train accuracy: 37.328 | test accuracy: 34.9 
*** Epoch 17 ***
loss: 2.0008360686237996 | train accuracy: 37.762 | test accuracy: 32.4 
*** Epoch 18 ***
loss: 2.008681913885329 | train accuracy: 37.608 | test accuracy: 29.79 
*** Epoch 19 ***
loss: 1.9947427464248952 | train accuracy: 37.638 | test accuracy: 31.88 
*** Epoch 20 ***
loss: 1.9982284661222982 | train accuracy: 37.67 | test accuracy: 33.88 
*** Epoch 21 ***
loss: 2.000263682827896 | train accuracy: 37.546 | test accuracy: 34.42 
*** Epoch 22 ***
loss: 2.000985434050989 | train accuracy: 37.948 | test accuracy: 33.49 
*** Epoch 23 ***
loss: 1.9982194488826241 | train accuracy: 37.66 | test accuracy: 33.62 
*** Epoch 24 ***
loss: 1.986224498437332 | train accuracy: 38.196 | test accuracy: 34.07 
*** Epoch 25 ***
loss: 1.9854518105465562 | train accuracy: 38.0 | test accuracy: 31.5 
*** Epoch 26 ***
loss: 1.9841910950611215 | train accuracy: 37.982 | test accuracy: 31.53 
*** Epoch 27 ***
loss: 1.9882920984605288 | train accuracy: 38.04 | test accuracy: 31.4 
*** Epoch 28 ***
loss: 1.9869283489678486 | train accuracy: 37.966 | test accuracy: 33.45 
*** Epoch 29 ***
loss: 1.985974595052608 | train accuracy: 38.358 | test accuracy: 30.66 
*** Epoch 30 ***
loss: 1.9835525769646458 | train accuracy: 38.204 | test accuracy: 33.63 
*** Epoch 31 ***
loss: 1.9833532526203324 | train accuracy: 38.17 | test accuracy: 33.43 
*** Epoch 32 ***
loss: 1.9756543790638872 | train accuracy: 38.26 | test accuracy: 33.36 
*** Epoch 33 ***
loss: 1.9836211827551495 | train accuracy: 38.256 | test accuracy: 32.07 
*** Epoch 34 ***
loss: 1.965811659115292 | train accuracy: 38.486 | test accuracy: 34.38 
*** Epoch 35 ***
loss: 1.980186080483043 | train accuracy: 38.274 | test accuracy: 34.03 
*** Epoch 36 ***
loss: 1.9741319785707807 | train accuracy: 38.204 | test accuracy: 34.27 
*** Epoch 37 ***
loss: 1.9753801281462526 | train accuracy: 38.35 | test accuracy: 30.81 
*** Epoch 38 ***
loss: 1.9734200985348314 | train accuracy: 38.55 | test accuracy: 33.46 
*** Epoch 39 ***
loss: 1.9717167868622494 | train accuracy: 38.54 | test accuracy: 31.96 
*** Epoch 40 ***
loss: 1.9753100692526047 | train accuracy: 38.918 | test accuracy: 33.5 
*** Epoch 41 ***
loss: 1.9692233857153387 | train accuracy: 38.416 | test accuracy: 30.86 
*** Epoch 42 ***
loss: 1.9676136678380245 | train accuracy: 38.718 | test accuracy: 33.22 
*** Epoch 43 ***
loss: 1.9668692260222373 | train accuracy: 38.796 | test accuracy: 33.03 
*** Epoch 44 ***
loss: 1.9624177527786284 | train accuracy: 38.576 | test accuracy: 31.88 
*** Epoch 45 ***
loss: 1.9672753758047765 | train accuracy: 38.63 | test accuracy: 33.85 
*** Epoch 46 ***
loss: 1.9663070780447516 | train accuracy: 38.696 | test accuracy: 34.85 
*** Epoch 47 ***
loss: 1.96525482703735 | train accuracy: 38.47 | test accuracy: 33.34 
*** Epoch 48 ***
loss: 1.96224342102036 | train accuracy: 38.742 | test accuracy: 33.34 
*** Epoch 49 ***
loss: 1.9616173026702224 | train accuracy: 38.77 | test accuracy: 31.5 
CPU times: total: 39min 16s
Wall time: 25min 10s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 2.154335 32.148 33.79
1 1 2.105482 34.362 31.71
2 2 2.086505 34.882 32.63
3 3 2.073305 35.386 34.09
4 4 2.061490 35.818 33.76
5 5 2.051230 36.114 33.08
6 6 2.045037 36.362 33.63
7 7 2.042749 36.446 32.23
8 8 2.034763 36.856 33.03
9 9 2.029293 36.824 32.40
10 10 2.028710 36.854 31.66
11 11 2.016864 37.068 33.07
12 12 2.026160 37.144 33.02
13 13 2.013506 37.314 32.61
14 14 2.015463 37.160 33.66
15 15 2.008482 37.520 34.45
16 16 2.009314 37.328 34.90
17 17 2.000836 37.762 32.40
18 18 2.008682 37.608 29.79
19 19 1.994743 37.638 31.88
20 20 1.998228 37.670 33.88
21 21 2.000264 37.546 34.42
22 22 2.000985 37.948 33.49
23 23 1.998219 37.660 33.62
24 24 1.986224 38.196 34.07
25 25 1.985452 38.000 31.50
26 26 1.984191 37.982 31.53
27 27 1.988292 38.040 31.40
28 28 1.986928 37.966 33.45
29 29 1.985975 38.358 30.66
30 30 1.983553 38.204 33.63
31 31 1.983353 38.170 33.43
32 32 1.975654 38.260 33.36
33 33 1.983621 38.256 32.07
34 34 1.965812 38.486 34.38
35 35 1.980186 38.274 34.03
36 36 1.974132 38.204 34.27
37 37 1.975380 38.350 30.81
38 38 1.973420 38.550 33.46
39 39 1.971717 38.540 31.96
40 40 1.975310 38.918 33.50
41 41 1.969223 38.416 30.86
42 42 1.967614 38.718 33.22
43 43 1.966869 38.796 33.03
44 44 1.962418 38.576 31.88
45 45 1.967275 38.630 33.85
46 46 1.966307 38.696 34.85
47 47 1.965255 38.470 33.34
48 48 1.962243 38.742 33.34
49 49 1.961617 38.770 31.50
In [ ]:
#Plot

fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')

fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')

subop = {'Train Loss': df[ 'Train Loss']}

for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )

fig.show()
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')

fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')

subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }

for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )

fig.show()

(3)¶

Change the code to have multiple fully connected layers. Try having a layer from input to 110 neurons and then a layer to 74 neurons, and finally a layer to 10 neurons, one for each category. What happens if you do not use ReLU? Describe why

Answer:

If we don't use ReLU, the accuracy of the network on the 10000 test images decreases but there only a slight difference between training time. Using ReLU in CNN tends to have better performance since it decreases the likelihood of vanishing gradient and helps with faster learning. ReLU helps eliminate the vanishing gradient issue by returning 1 for positive input value and 0 for negative input value and then it assigns a derivative of 1 for values that are greater than zero. Also, neural networks are built with perceptron so we need activation functions like ReLU to implement the classification process locally and globally. Without using activations functions like ReLU a neural network model would not have the best ability to classify sinc it would only be a large linear mapping of input and output. In other words, the model would only be a simple linear transformation, which would not work well for multi-classification. In short, the non-linearity of the activation functions like ReLU help improve the classification ability of a neural network by acting as a global or local function approximator.

With ReLU¶

In [ ]:
#use the same dataloader as part 2
# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
  cat   car  bird  ship
2022-11-27T21:03:04.297662 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
class FullNet2(nn.Module):
    def __init__(self):
        super(FullNet2, self).__init__()
        self.fc1 = nn.Linear(32 * 32 * 3, 110)
        self.fc2 = nn.Linear(110, 74)
        self.fc3 = nn.Linear(74, 10) # nn.Linear(in_features, out_features)

    def forward(self, x):
        x = x.view(-1, 32 * 32 * 3)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        #x = F.softmax(x, dim=1) 
        return x


FullNet2 = FullNet2()
print(FullNet2)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(FullNet2.parameters(), lr=0.001, momentum=0.9)
FullNet2(
  (fc1): Linear(in_features=3072, out_features=110, bias=True)
  (fc2): Linear(in_features=110, out_features=74, bias=True)
  (fc3): Linear(in_features=74, out_features=10, bias=True)
)
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = FullNet2(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = FullNet2(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 1.677417224572176 | train accuracy: 40.33 | test accuracy: 46.96 
*** Epoch 1 ***
loss: 1.4722141865129728 | train accuracy: 47.886 | test accuracy: 49.43 
*** Epoch 2 ***
loss: 1.3785199953989893 | train accuracy: 51.168 | test accuracy: 49.69 
*** Epoch 3 ***
loss: 1.3133843888423806 | train accuracy: 53.5 | test accuracy: 49.98 
*** Epoch 4 ***
loss: 1.262113986824923 | train accuracy: 55.01 | test accuracy: 48.81 
*** Epoch 5 ***
loss: 1.2196890833935974 | train accuracy: 56.466 | test accuracy: 50.22 
*** Epoch 6 ***
loss: 1.183028033992839 | train accuracy: 57.776 | test accuracy: 50.86 
*** Epoch 7 ***
loss: 1.1474604078730433 | train accuracy: 58.76 | test accuracy: 51.56 
*** Epoch 8 ***
loss: 1.1121967516424325 | train accuracy: 60.16 | test accuracy: 51.62 
*** Epoch 9 ***
loss: 1.0891586131139004 | train accuracy: 61.034 | test accuracy: 51.42 
*** Epoch 10 ***
loss: 1.0642123444314922 | train accuracy: 61.96 | test accuracy: 51.0 
*** Epoch 11 ***
loss: 1.0396081625594729 | train accuracy: 62.796 | test accuracy: 50.9 
*** Epoch 12 ***
loss: 1.0145432443852538 | train accuracy: 63.48 | test accuracy: 50.5 
*** Epoch 13 ***
loss: 0.9966886921460193 | train accuracy: 64.174 | test accuracy: 51.97 
*** Epoch 14 ***
loss: 0.9704105486637764 | train accuracy: 65.074 | test accuracy: 51.83 
*** Epoch 15 ***
loss: 0.9518514145351216 | train accuracy: 65.714 | test accuracy: 51.62 
*** Epoch 16 ***
loss: 0.9354019439944266 | train accuracy: 66.42 | test accuracy: 50.98 
*** Epoch 17 ***
loss: 0.9240237494753063 | train accuracy: 66.954 | test accuracy: 50.35 
*** Epoch 18 ***
loss: 0.9047816947992553 | train accuracy: 67.42 | test accuracy: 50.42 
*** Epoch 19 ***
loss: 0.8847028752911845 | train accuracy: 68.176 | test accuracy: 50.67 
*** Epoch 20 ***
loss: 0.8732586350447772 | train accuracy: 68.386 | test accuracy: 50.1 
*** Epoch 21 ***
loss: 0.8569765430402718 | train accuracy: 69.11 | test accuracy: 50.97 
*** Epoch 22 ***
loss: 0.8407594208526866 | train accuracy: 69.558 | test accuracy: 49.31 
*** Epoch 23 ***
loss: 0.8322516559832821 | train accuracy: 70.136 | test accuracy: 50.36 
*** Epoch 24 ***
loss: 0.8179179523392228 | train accuracy: 70.524 | test accuracy: 50.21 
*** Epoch 25 ***
loss: 0.8025588082027872 | train accuracy: 71.13 | test accuracy: 48.71 
*** Epoch 26 ***
loss: 0.793839994160811 | train accuracy: 71.184 | test accuracy: 49.72 
*** Epoch 27 ***
loss: 0.7803896164078054 | train accuracy: 71.844 | test accuracy: 49.98 
*** Epoch 28 ***
loss: 0.7689069735225047 | train accuracy: 72.332 | test accuracy: 50.46 
*** Epoch 29 ***
loss: 0.7589200108683987 | train accuracy: 72.532 | test accuracy: 48.6 
*** Epoch 30 ***
loss: 0.7473999898873916 | train accuracy: 73.074 | test accuracy: 50.17 
*** Epoch 31 ***
loss: 0.7383423211685972 | train accuracy: 73.12 | test accuracy: 48.84 
*** Epoch 32 ***
loss: 0.7255527282457658 | train accuracy: 73.762 | test accuracy: 48.71 
*** Epoch 33 ***
loss: 0.7203607692083543 | train accuracy: 73.816 | test accuracy: 48.69 
*** Epoch 34 ***
loss: 0.7064822803799714 | train accuracy: 74.652 | test accuracy: 49.65 
*** Epoch 35 ***
loss: 0.6986588598172531 | train accuracy: 75.098 | test accuracy: 49.93 
*** Epoch 36 ***
loss: 0.6903838735612146 | train accuracy: 75.106 | test accuracy: 49.22 
*** Epoch 37 ***
loss: 0.6809352324905338 | train accuracy: 75.368 | test accuracy: 48.68 
*** Epoch 38 ***
loss: 0.6692599252853705 | train accuracy: 75.652 | test accuracy: 48.61 
*** Epoch 39 ***
loss: 0.6661890551630688 | train accuracy: 75.948 | test accuracy: 49.21 
*** Epoch 40 ***
loss: 0.6597034048847507 | train accuracy: 76.236 | test accuracy: 48.49 
*** Epoch 41 ***
loss: 0.6484266740787779 | train accuracy: 76.686 | test accuracy: 48.16 
*** Epoch 42 ***
loss: 0.6421853214914404 | train accuracy: 76.974 | test accuracy: 48.21 
*** Epoch 43 ***
loss: 0.6333777473474241 | train accuracy: 77.276 | test accuracy: 47.6 
*** Epoch 44 ***
loss: 0.6313689367688495 | train accuracy: 77.308 | test accuracy: 48.61 
*** Epoch 45 ***
loss: 0.6274228803846482 | train accuracy: 77.616 | test accuracy: 48.62 
*** Epoch 46 ***
loss: 0.6193685014012814 | train accuracy: 77.656 | test accuracy: 47.9 
*** Epoch 47 ***
loss: 0.6095266558651954 | train accuracy: 78.016 | test accuracy: 48.02 
*** Epoch 48 ***
loss: 0.6037740416973746 | train accuracy: 78.302 | test accuracy: 48.22 
*** Epoch 49 ***
loss: 0.6059272893938579 | train accuracy: 78.12 | test accuracy: 47.53 
CPU times: total: 45min 24s
Wall time: 38min 30s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 1.677417 40.330 46.96
1 1 1.472214 47.886 49.43
2 2 1.378520 51.168 49.69
3 3 1.313384 53.500 49.98
4 4 1.262114 55.010 48.81
5 5 1.219689 56.466 50.22
6 6 1.183028 57.776 50.86
7 7 1.147460 58.760 51.56
8 8 1.112197 60.160 51.62
9 9 1.089159 61.034 51.42
10 10 1.064212 61.960 51.00
11 11 1.039608 62.796 50.90
12 12 1.014543 63.480 50.50
13 13 0.996689 64.174 51.97
14 14 0.970411 65.074 51.83
15 15 0.951851 65.714 51.62
16 16 0.935402 66.420 50.98
17 17 0.924024 66.954 50.35
18 18 0.904782 67.420 50.42
19 19 0.884703 68.176 50.67
20 20 0.873259 68.386 50.10
21 21 0.856977 69.110 50.97
22 22 0.840759 69.558 49.31
23 23 0.832252 70.136 50.36
24 24 0.817918 70.524 50.21
25 25 0.802559 71.130 48.71
26 26 0.793840 71.184 49.72
27 27 0.780390 71.844 49.98
28 28 0.768907 72.332 50.46
29 29 0.758920 72.532 48.60
30 30 0.747400 73.074 50.17
31 31 0.738342 73.120 48.84
32 32 0.725553 73.762 48.71
33 33 0.720361 73.816 48.69
34 34 0.706482 74.652 49.65
35 35 0.698659 75.098 49.93
36 36 0.690384 75.106 49.22
37 37 0.680935 75.368 48.68
38 38 0.669260 75.652 48.61
39 39 0.666189 75.948 49.21
40 40 0.659703 76.236 48.49
41 41 0.648427 76.686 48.16
42 42 0.642185 76.974 48.21
43 43 0.633378 77.276 47.60
44 44 0.631369 77.308 48.61
45 45 0.627423 77.616 48.62
46 46 0.619369 77.656 47.90
47 47 0.609527 78.016 48.02
48 48 0.603774 78.302 48.22
49 49 0.605927 78.120 47.53
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')

fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')

subop = {'Train Loss': df[ 'Train Loss']}

for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )

fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/3a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')

fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')

subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }

for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )

fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/3b.png')
Out[ ]:

Without ReLU¶

In [ ]:
#same data loader as part 2 
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
horse truck   dog truck
2022-11-25T21:06:48.959512 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
class FullNet3(nn.Module):
    def __init__(self):
        super(FullNet3, self).__init__()
        self.fc1 = nn.Linear(32 * 32 * 3, 110)
        self.fc2 = nn.Linear(110, 74)
        self.fc3 = nn.Linear(74, 10) # nn.Linear(in_features, out_features)

    def forward(self, x):
        x = x.view(-1, 32 * 32 * 3)
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        #x = F.softmax(self.fc3(x), dim=1)
        return x


FullNet3 = FullNet3()
print(FullNet3)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(FullNet3.parameters(), lr=0.001, momentum=0.9)
FullNet3(
  (fc1): Linear(in_features=3072, out_features=110, bias=True)
  (fc2): Linear(in_features=110, out_features=74, bias=True)
  (fc3): Linear(in_features=74, out_features=10, bias=True)
)
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = FullNet3(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = FullNet3(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 1.824201954551788 | train accuracy: 36.024 | test accuracy: 37.61 
*** Epoch 1 ***
loss: 1.7607993198348117 | train accuracy: 39.022 | test accuracy: 38.96 
*** Epoch 2 ***
loss: 1.744571483924929 | train accuracy: 39.488 | test accuracy: 38.15 
*** Epoch 3 ***
loss: 1.735694215711436 | train accuracy: 40.172 | test accuracy: 38.81 
*** Epoch 4 ***
loss: 1.727805276860408 | train accuracy: 40.246 | test accuracy: 38.78 
*** Epoch 5 ***
loss: 1.7200337460192883 | train accuracy: 40.51 | test accuracy: 38.92 
*** Epoch 6 ***
loss: 1.7175825842905734 | train accuracy: 41.084 | test accuracy: 38.93 
*** Epoch 7 ***
loss: 1.710343520393018 | train accuracy: 41.2 | test accuracy: 39.8 
*** Epoch 8 ***
loss: 1.7092819672764297 | train accuracy: 41.4 | test accuracy: 39.29 
*** Epoch 9 ***
loss: 1.7052542597022053 | train accuracy: 41.39 | test accuracy: 38.86 
*** Epoch 10 ***
loss: 1.7036450538375643 | train accuracy: 41.286 | test accuracy: 39.18 
*** Epoch 11 ***
loss: 1.6997909060892404 | train accuracy: 41.828 | test accuracy: 38.65 
*** Epoch 12 ***
loss: 1.698652484212573 | train accuracy: 41.642 | test accuracy: 39.42 
*** Epoch 13 ***
loss: 1.6942036420104503 | train accuracy: 41.896 | test accuracy: 39.41 
*** Epoch 14 ***
loss: 1.6932806528509212 | train accuracy: 42.072 | test accuracy: 38.92 
*** Epoch 15 ***
loss: 1.690842537250373 | train accuracy: 42.102 | test accuracy: 39.5 
*** Epoch 16 ***
loss: 1.6900787286633863 | train accuracy: 41.898 | test accuracy: 39.71 
*** Epoch 17 ***
loss: 1.6872462848776177 | train accuracy: 42.15 | test accuracy: 36.56 
*** Epoch 18 ***
loss: 1.6872988571471352 | train accuracy: 42.234 | test accuracy: 39.19 
*** Epoch 19 ***
loss: 1.6842311128958463 | train accuracy: 42.202 | test accuracy: 38.91 
*** Epoch 20 ***
loss: 1.6827458305848733 | train accuracy: 42.182 | test accuracy: 39.32 
*** Epoch 21 ***
loss: 1.6806149077859913 | train accuracy: 42.518 | test accuracy: 39.09 
*** Epoch 22 ***
loss: 1.6809690411338445 | train accuracy: 42.39 | test accuracy: 39.3 
*** Epoch 23 ***
loss: 1.679326337214441 | train accuracy: 42.452 | test accuracy: 38.67 
*** Epoch 24 ***
loss: 1.6774064619053897 | train accuracy: 42.404 | test accuracy: 38.7 
*** Epoch 25 ***
loss: 1.6754016113255976 | train accuracy: 42.602 | test accuracy: 38.38 
*** Epoch 26 ***
loss: 1.6738855119504608 | train accuracy: 42.528 | test accuracy: 38.02 
*** Epoch 27 ***
loss: 1.674441500590967 | train accuracy: 42.44 | test accuracy: 38.56 
*** Epoch 28 ***
loss: 1.672919089049852 | train accuracy: 42.84 | test accuracy: 38.91 
*** Epoch 29 ***
loss: 1.6733607326887046 | train accuracy: 42.752 | test accuracy: 37.89 
*** Epoch 30 ***
loss: 1.6703083879976848 | train accuracy: 42.724 | test accuracy: 38.54 
*** Epoch 31 ***
loss: 1.6688603026450697 | train accuracy: 42.746 | test accuracy: 38.56 
*** Epoch 32 ***
loss: 1.6678179404727744 | train accuracy: 42.924 | test accuracy: 36.61 
*** Epoch 33 ***
loss: 1.6655160454509068 | train accuracy: 43.008 | test accuracy: 39.47 
*** Epoch 34 ***
loss: 1.6660944289689599 | train accuracy: 43.12 | test accuracy: 38.23 
*** Epoch 35 ***
loss: 1.6669652722403376 | train accuracy: 42.832 | test accuracy: 38.43 
*** Epoch 36 ***
loss: 1.6656390795641702 | train accuracy: 42.974 | test accuracy: 38.23 
*** Epoch 37 ***
loss: 1.665472630136289 | train accuracy: 42.936 | test accuracy: 38.44 
*** Epoch 38 ***
loss: 1.6635169804022307 | train accuracy: 43.022 | test accuracy: 38.09 
*** Epoch 39 ***
loss: 1.6609962129067857 | train accuracy: 43.14 | test accuracy: 37.84 
*** Epoch 40 ***
loss: 1.661804382191037 | train accuracy: 42.92 | test accuracy: 38.62 
*** Epoch 41 ***
loss: 1.6610926077507793 | train accuracy: 43.092 | test accuracy: 39.73 
*** Epoch 42 ***
loss: 1.6619592911816186 | train accuracy: 43.072 | test accuracy: 39.15 
*** Epoch 43 ***
loss: 1.65882894882241 | train accuracy: 43.122 | test accuracy: 37.86 
*** Epoch 44 ***
loss: 1.6575101103181504 | train accuracy: 43.17 | test accuracy: 38.24 
*** Epoch 45 ***
loss: 1.658654462653109 | train accuracy: 43.292 | test accuracy: 38.84 
*** Epoch 46 ***
loss: 1.6587285991644551 | train accuracy: 43.208 | test accuracy: 37.83 
*** Epoch 47 ***
loss: 1.6576509813121132 | train accuracy: 43.232 | test accuracy: 38.01 
*** Epoch 48 ***
loss: 1.6560799847989953 | train accuracy: 43.266 | test accuracy: 38.73 
*** Epoch 49 ***
loss: 1.6559563134515636 | train accuracy: 43.416 | test accuracy: 39.07 
CPU times: total: 43min 59s
Wall time: 35min 43s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 1.824202 36.024 37.61
1 1 1.760799 39.022 38.96
2 2 1.744571 39.488 38.15
3 3 1.735694 40.172 38.81
4 4 1.727805 40.246 38.78
5 5 1.720034 40.510 38.92
6 6 1.717583 41.084 38.93
7 7 1.710344 41.200 39.80
8 8 1.709282 41.400 39.29
9 9 1.705254 41.390 38.86
10 10 1.703645 41.286 39.18
11 11 1.699791 41.828 38.65
12 12 1.698652 41.642 39.42
13 13 1.694204 41.896 39.41
14 14 1.693281 42.072 38.92
15 15 1.690843 42.102 39.50
16 16 1.690079 41.898 39.71
17 17 1.687246 42.150 36.56
18 18 1.687299 42.234 39.19
19 19 1.684231 42.202 38.91
20 20 1.682746 42.182 39.32
21 21 1.680615 42.518 39.09
22 22 1.680969 42.390 39.30
23 23 1.679326 42.452 38.67
24 24 1.677406 42.404 38.70
25 25 1.675402 42.602 38.38
26 26 1.673886 42.528 38.02
27 27 1.674442 42.440 38.56
28 28 1.672919 42.840 38.91
29 29 1.673361 42.752 37.89
30 30 1.670308 42.724 38.54
31 31 1.668860 42.746 38.56
32 32 1.667818 42.924 36.61
33 33 1.665516 43.008 39.47
34 34 1.666094 43.120 38.23
35 35 1.666965 42.832 38.43
36 36 1.665639 42.974 38.23
37 37 1.665473 42.936 38.44
38 38 1.663517 43.022 38.09
39 39 1.660996 43.140 37.84
40 40 1.661804 42.920 38.62
41 41 1.661093 43.092 39.73
42 42 1.661959 43.072 39.15
43 43 1.658829 43.122 37.86
44 44 1.657510 43.170 38.24
45 45 1.658654 43.292 38.84
46 46 1.658729 43.208 37.83
47 47 1.657651 43.232 38.01
48 48 1.656080 43.266 38.73
49 49 1.655956 43.416 39.07
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/3aa.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/3bb.png')
Out[ ]:

(4)¶

Change the code by adding two convolutional layers along with maxpooling layers before the fully connected layers. This will be similar to the example in the tutorial. Use this model for the following sections.

Answer:

For this part, I decided to use only two fully connected layers instead of three fully connected layers used in the tutorial. I also decided to change padding, output channels, output features, and filters. Based on the results shown below, it appears that this model yields slightly higher accuracy than the the one used in the tutorial.

In [ ]:
#use the same data loader and batch size as part 2 
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
  cat   cat   dog  frog
2022-12-01T10:45:09.074762 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
#change padding, output channels, output features, and filters 
#change to only 2 fully connected layers to also see the effect of number of fully connected layers 
class MultiNet(nn.Module):
    def __init__(self):
        super(MultiNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(32, 16, 3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(2, stride=2)
        self.fc1 = nn.Linear(16 * 8 * 8, 64)
        self.fc2 = nn.Linear(64, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1,16 * 8 * 8)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x


MultiNet = MultiNet().to(device)
print(MultiNet)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(MultiNet.parameters(), lr=0.0002, momentum=0.9) #change learning rate 
MultiNet(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=1024, out_features=64, bias=True)
  (fc2): Linear(in_features=64, out_features=10, bias=True)
)
In [ ]:
#print number of parameters 
total = 0
print('Trainable parameters:')
for name, param in MultiNet.named_parameters():
    if param.requires_grad:
        print(name, '\t', param.numel())
        total += param.numel()
print()
print('Total', '\t', total)
Trainable parameters:
conv1.weight 	 864
conv1.bias 	 32
conv2.weight 	 4608
conv2.bias 	 16
fc1.weight 	 65536
fc1.bias 	 64
fc2.weight 	 640
fc2.bias 	 10

Total 	 71770
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)
            # for i in range(4):
            #     label = labels[i]
            #     class_correct[label] += c[i].item()
            #     class_total[label] += 1
    # for i in range(10):
    #     print('Accuracy of %5s : %2d %%' % (
    #     classes[i], 100 * class_correct[i] / class_total[i]))

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 1.8780543034706547 | train accuracy: 32.146 | test accuracy: 44.02 
*** Epoch 1 ***
loss: 1.416798793092395 | train accuracy: 49.07 | test accuracy: 52.27 
*** Epoch 2 ***
loss: 1.267636388186188 | train accuracy: 54.958 | test accuracy: 56.8 
*** Epoch 3 ***
loss: 1.1584952184016184 | train accuracy: 59.014 | test accuracy: 60.48 
*** Epoch 4 ***
loss: 1.0609709163438474 | train accuracy: 62.718 | test accuracy: 62.56 
*** Epoch 5 ***
loss: 0.9825279659819844 | train accuracy: 65.39 | test accuracy: 64.96 
*** Epoch 6 ***
loss: 0.9206138926980901 | train accuracy: 67.648 | test accuracy: 65.09 
*** Epoch 7 ***
loss: 0.8718484467065832 | train accuracy: 69.508 | test accuracy: 66.23 
*** Epoch 8 ***
loss: 0.8281666102723546 | train accuracy: 71.086 | test accuracy: 67.15 
*** Epoch 9 ***
loss: 0.7911792582180597 | train accuracy: 72.312 | test accuracy: 68.87 
*** Epoch 10 ***
loss: 0.759802233920025 | train accuracy: 73.456 | test accuracy: 68.32 
*** Epoch 11 ***
loss: 0.7251949095232529 | train accuracy: 74.57 | test accuracy: 69.29 
*** Epoch 12 ***
loss: 0.6981547271744117 | train accuracy: 75.77 | test accuracy: 68.23 
*** Epoch 13 ***
loss: 0.6679169682529399 | train accuracy: 76.608 | test accuracy: 69.52 
*** Epoch 14 ***
loss: 0.6444989167444729 | train accuracy: 77.406 | test accuracy: 69.4 
*** Epoch 15 ***
loss: 0.6168408014060068 | train accuracy: 78.248 | test accuracy: 69.09 
*** Epoch 16 ***
loss: 0.5926428593394372 | train accuracy: 79.162 | test accuracy: 68.86 
*** Epoch 17 ***
loss: 0.5699344410174253 | train accuracy: 80.084 | test accuracy: 69.18 
*** Epoch 18 ***
loss: 0.5488273560683237 | train accuracy: 80.702 | test accuracy: 69.29 
*** Epoch 19 ***
loss: 0.5292396302127701 | train accuracy: 81.424 | test accuracy: 69.04 
*** Epoch 20 ***
loss: 0.5076650357912094 | train accuracy: 81.97 | test accuracy: 68.7 
*** Epoch 21 ***
loss: 0.4887016224592172 | train accuracy: 82.57 | test accuracy: 69.33 
*** Epoch 22 ***
loss: 0.4666335650980077 | train accuracy: 83.47 | test accuracy: 68.04 
*** Epoch 23 ***
loss: 0.4527140850413417 | train accuracy: 83.884 | test accuracy: 69.19 
*** Epoch 24 ***
loss: 0.4321576274063681 | train accuracy: 84.548 | test accuracy: 68.88 
*** Epoch 25 ***
loss: 0.41258398511644406 | train accuracy: 85.218 | test accuracy: 68.71 
*** Epoch 26 ***
loss: 0.39739321760525587 | train accuracy: 85.94 | test accuracy: 68.46 
*** Epoch 27 ***
loss: 0.38263335448402463 | train accuracy: 86.28 | test accuracy: 68.46 
*** Epoch 28 ***
loss: 0.3668822811421674 | train accuracy: 86.99 | test accuracy: 67.52 
*** Epoch 29 ***
loss: 0.35120304823100357 | train accuracy: 87.44 | test accuracy: 67.82 
*** Epoch 30 ***
loss: 0.3362968934208356 | train accuracy: 87.992 | test accuracy: 67.41 
*** Epoch 31 ***
loss: 0.32030493423028816 | train accuracy: 88.608 | test accuracy: 67.02 
*** Epoch 32 ***
loss: 0.3106067577500716 | train accuracy: 88.908 | test accuracy: 66.94 
*** Epoch 33 ***
loss: 0.2950502536606819 | train accuracy: 89.426 | test accuracy: 67.33 
*** Epoch 34 ***
loss: 0.2845300995272491 | train accuracy: 89.726 | test accuracy: 67.3 
*** Epoch 35 ***
loss: 0.2717765205955772 | train accuracy: 90.246 | test accuracy: 67.37 
*** Epoch 36 ***
loss: 0.261323597453228 | train accuracy: 90.65 | test accuracy: 67.55 
*** Epoch 37 ***
loss: 0.24697236418324048 | train accuracy: 91.198 | test accuracy: 66.56 
*** Epoch 38 ***
loss: 0.23908215785114037 | train accuracy: 91.472 | test accuracy: 67.04 
*** Epoch 39 ***
loss: 0.22783514340404604 | train accuracy: 91.77 | test accuracy: 66.9 
*** Epoch 40 ***
loss: 0.2180348124407303 | train accuracy: 92.278 | test accuracy: 66.44 
*** Epoch 41 ***
loss: 0.2063170114757499 | train accuracy: 92.602 | test accuracy: 66.71 
*** Epoch 42 ***
loss: 0.2009646180486203 | train accuracy: 92.598 | test accuracy: 66.83 
*** Epoch 43 ***
loss: 0.19543770194048873 | train accuracy: 93.094 | test accuracy: 66.67 
*** Epoch 44 ***
loss: 0.19141478395899947 | train accuracy: 93.104 | test accuracy: 67.18 
*** Epoch 45 ***
loss: 0.18583327076824283 | train accuracy: 93.242 | test accuracy: 66.59 
*** Epoch 46 ***
loss: 0.17071196161216376 | train accuracy: 93.928 | test accuracy: 66.38 
*** Epoch 47 ***
loss: 0.16734584174914352 | train accuracy: 93.968 | test accuracy: 66.4 
*** Epoch 48 ***
loss: 0.15693258837214855 | train accuracy: 94.362 | test accuracy: 66.29 
*** Epoch 49 ***
loss: 0.16054843534149088 | train accuracy: 94.17 | test accuracy: 64.87 
CPU times: total: 1h 18min 35s
Wall time: 48min 27s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 1.878054 32.146 44.02
1 1 1.416799 49.070 52.27
2 2 1.267636 54.958 56.80
3 3 1.158495 59.014 60.48
4 4 1.060971 62.718 62.56
5 5 0.982528 65.390 64.96
6 6 0.920614 67.648 65.09
7 7 0.871848 69.508 66.23
8 8 0.828167 71.086 67.15
9 9 0.791179 72.312 68.87
10 10 0.759802 73.456 68.32
11 11 0.725195 74.570 69.29
12 12 0.698155 75.770 68.23
13 13 0.667917 76.608 69.52
14 14 0.644499 77.406 69.40
15 15 0.616841 78.248 69.09
16 16 0.592643 79.162 68.86
17 17 0.569934 80.084 69.18
18 18 0.548827 80.702 69.29
19 19 0.529240 81.424 69.04
20 20 0.507665 81.970 68.70
21 21 0.488702 82.570 69.33
22 22 0.466634 83.470 68.04
23 23 0.452714 83.884 69.19
24 24 0.432158 84.548 68.88
25 25 0.412584 85.218 68.71
26 26 0.397393 85.940 68.46
27 27 0.382633 86.280 68.46
28 28 0.366882 86.990 67.52
29 29 0.351203 87.440 67.82
30 30 0.336297 87.992 67.41
31 31 0.320305 88.608 67.02
32 32 0.310607 88.908 66.94
33 33 0.295050 89.426 67.33
34 34 0.284530 89.726 67.30
35 35 0.271777 90.246 67.37
36 36 0.261324 90.650 67.55
37 37 0.246972 91.198 66.56
38 38 0.239082 91.472 67.04
39 39 0.227835 91.770 66.90
40 40 0.218035 92.278 66.44
41 41 0.206317 92.602 66.71
42 42 0.200965 92.598 66.83
43 43 0.195438 93.094 66.67
44 44 0.191415 93.104 67.18
45 45 0.185833 93.242 66.59
46 46 0.170712 93.928 66.38
47 47 0.167346 93.968 66.40
48 48 0.156933 94.362 66.29
49 49 0.160548 94.170 64.87
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/4a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/4b.png')
Out[ ]:

(5)¶

Try multiple batch sizes to see the effect and describe the findings. Please use batch size of 1, 4, and

  1. If 1000 does not fit into the memory of your machine, please feel free to reduce it to the largest possible number.

Answer:

Using the model I defined in part (4) above, batch size 1 appears to perform better than expected. For batch sizes 10, 100 and 1000, it appears that there is some overfitting since the training accuracy is so much higher compare to the test accuracy.

Batch size = 4 (in part 4 above)¶

Batch size = 1¶

In [ ]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=1,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=1,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(1)))
Files already downloaded and verified
Files already downloaded and verified
  dog
2022-12-01T18:33:10.708289 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(15): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 0.7999621010004819 | train accuracy: 73.08 | test accuracy: 64.38 
*** Epoch 1 ***
loss: 0.5851293112842335 | train accuracy: 79.0 | test accuracy: 63.89 
*** Epoch 2 ***
loss: 0.5306400444689515 | train accuracy: 80.922 | test accuracy: 63.9 
*** Epoch 3 ***
loss: 0.5094365636474008 | train accuracy: 81.872 | test accuracy: 64.25 
*** Epoch 4 ***
loss: 0.478789111002054 | train accuracy: 82.498 | test accuracy: 65.43 
*** Epoch 5 ***
loss: 0.4579591372638836 | train accuracy: 83.578 | test accuracy: 65.5 
*** Epoch 6 ***
loss: 0.44394022266802385 | train accuracy: 84.092 | test accuracy: 64.34 
*** Epoch 7 ***
loss: 0.42938015151365866 | train accuracy: 84.472 | test accuracy: 65.14 
*** Epoch 8 ***
loss: 0.41947313669005976 | train accuracy: 85.14 | test accuracy: 65.03 
*** Epoch 9 ***
loss: 0.4081550629241809 | train accuracy: 85.49 | test accuracy: 65.47 
*** Epoch 10 ***
loss: 0.3954313142016223 | train accuracy: 85.938 | test accuracy: 65.05 
*** Epoch 11 ***
loss: 0.39222712977315966 | train accuracy: 86.164 | test accuracy: 64.02 
*** Epoch 12 ***
loss: 0.3793914342707322 | train accuracy: 86.536 | test accuracy: 64.05 
*** Epoch 13 ***
loss: 0.3667076674663083 | train accuracy: 87.026 | test accuracy: 63.81 
*** Epoch 14 ***
loss: 0.36099897545381227 | train accuracy: 87.204 | test accuracy: 64.03 
CPU times: total: 1h 1min 14s
Wall time: 37min 20s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 0.799962 73.080 64.38
1 1 0.585129 79.000 63.89
2 2 0.530640 80.922 63.90
3 3 0.509437 81.872 64.25
4 4 0.478789 82.498 65.43
5 5 0.457959 83.578 65.50
6 6 0.443940 84.092 64.34
7 7 0.429380 84.472 65.14
8 8 0.419473 85.140 65.03
9 9 0.408155 85.490 65.47
10 10 0.395431 85.938 65.05
11 11 0.392227 86.164 64.02
12 12 0.379391 86.536 64.05
13 13 0.366708 87.026 63.81
14 14 0.360999 87.204 64.03
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/5-1a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/5-1b.png')
Out[ ]:

Batch size: 10¶

In [ ]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=10,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=10,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')



# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(10)))
Files already downloaded and verified
Files already downloaded and verified
 frog   dog   cat  deer   dog  frog   cat  deer horse  bird
2022-12-01T18:34:15.044446 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 0.08677877117384591 | train accuracy: 97.412 | test accuracy: 67.5 
*** Epoch 1 ***
loss: 0.07494641703175522 | train accuracy: 97.858 | test accuracy: 67.24 
*** Epoch 2 ***
loss: 0.06609822394390862 | train accuracy: 98.23 | test accuracy: 67.0 
*** Epoch 3 ***
loss: 0.059117207788701005 | train accuracy: 98.532 | test accuracy: 67.17 
*** Epoch 4 ***
loss: 0.05337124418276868 | train accuracy: 98.718 | test accuracy: 66.91 
*** Epoch 5 ***
loss: 0.04842699483783084 | train accuracy: 98.912 | test accuracy: 66.99 
*** Epoch 6 ***
loss: 0.04425891932012715 | train accuracy: 99.044 | test accuracy: 67.03 
*** Epoch 7 ***
loss: 0.04052997631697763 | train accuracy: 99.194 | test accuracy: 67.09 
*** Epoch 8 ***
loss: 0.037180889356642946 | train accuracy: 99.296 | test accuracy: 66.91 
*** Epoch 9 ***
loss: 0.03430292609681459 | train accuracy: 99.408 | test accuracy: 67.0 
*** Epoch 10 ***
loss: 0.031635014660143804 | train accuracy: 99.476 | test accuracy: 66.85 
*** Epoch 11 ***
loss: 0.029228499758009344 | train accuracy: 99.558 | test accuracy: 67.04 
*** Epoch 12 ***
loss: 0.02714797970520349 | train accuracy: 99.612 | test accuracy: 66.84 
*** Epoch 13 ***
loss: 0.025215152020371903 | train accuracy: 99.674 | test accuracy: 66.82 
*** Epoch 14 ***
loss: 0.023482588567530598 | train accuracy: 99.702 | test accuracy: 66.93 
*** Epoch 15 ***
loss: 0.021937300591473304 | train accuracy: 99.748 | test accuracy: 66.89 
*** Epoch 16 ***
loss: 0.02041508857143922 | train accuracy: 99.792 | test accuracy: 66.86 
*** Epoch 17 ***
loss: 0.019122144427952396 | train accuracy: 99.81 | test accuracy: 66.78 
*** Epoch 18 ***
loss: 0.017965763969019655 | train accuracy: 99.848 | test accuracy: 66.91 
*** Epoch 19 ***
loss: 0.016799643223341182 | train accuracy: 99.862 | test accuracy: 66.92 
*** Epoch 20 ***
loss: 0.015842499499776758 | train accuracy: 99.874 | test accuracy: 66.81 
*** Epoch 21 ***
loss: 0.014904507843230096 | train accuracy: 99.88 | test accuracy: 66.85 
*** Epoch 22 ***
loss: 0.014058254682025574 | train accuracy: 99.894 | test accuracy: 66.89 
*** Epoch 23 ***
loss: 0.013293195938719587 | train accuracy: 99.91 | test accuracy: 66.86 
*** Epoch 24 ***
loss: 0.012604429389067175 | train accuracy: 99.914 | test accuracy: 66.91 
*** Epoch 25 ***
loss: 0.01193563584140088 | train accuracy: 99.918 | test accuracy: 66.8 
*** Epoch 26 ***
loss: 0.011326101611970122 | train accuracy: 99.934 | test accuracy: 66.85 
*** Epoch 27 ***
loss: 0.010780051260866058 | train accuracy: 99.932 | test accuracy: 66.7 
*** Epoch 28 ***
loss: 0.010256169706183522 | train accuracy: 99.946 | test accuracy: 66.75 
*** Epoch 29 ***
loss: 0.009751417356587766 | train accuracy: 99.946 | test accuracy: 66.8 
*** Epoch 30 ***
loss: 0.009307469172809632 | train accuracy: 99.95 | test accuracy: 66.88 
*** Epoch 31 ***
loss: 0.008869691596371939 | train accuracy: 99.962 | test accuracy: 66.87 
*** Epoch 32 ***
loss: 0.008487882850262364 | train accuracy: 99.958 | test accuracy: 66.86 
*** Epoch 33 ***
loss: 0.008120872371867925 | train accuracy: 99.966 | test accuracy: 66.95 
*** Epoch 34 ***
loss: 0.007778769270117215 | train accuracy: 99.968 | test accuracy: 66.87 
*** Epoch 35 ***
loss: 0.007466513092990287 | train accuracy: 99.972 | test accuracy: 66.7 
*** Epoch 36 ***
loss: 0.007145348367622949 | train accuracy: 99.974 | test accuracy: 66.83 
*** Epoch 37 ***
loss: 0.0068678626692752105 | train accuracy: 99.974 | test accuracy: 66.82 
*** Epoch 38 ***
loss: 0.0065996848505223155 | train accuracy: 99.976 | test accuracy: 66.82 
*** Epoch 39 ***
loss: 0.006355647767645485 | train accuracy: 99.978 | test accuracy: 66.81 
*** Epoch 40 ***
loss: 0.006113466286945887 | train accuracy: 99.982 | test accuracy: 66.84 
*** Epoch 41 ***
loss: 0.00587685851298834 | train accuracy: 99.984 | test accuracy: 66.89 
*** Epoch 42 ***
loss: 0.005665412352482217 | train accuracy: 99.984 | test accuracy: 66.76 
*** Epoch 43 ***
loss: 0.005467527841215791 | train accuracy: 99.984 | test accuracy: 66.83 
*** Epoch 44 ***
loss: 0.005286200925406293 | train accuracy: 99.984 | test accuracy: 66.94 
*** Epoch 45 ***
loss: 0.005111358509622431 | train accuracy: 99.986 | test accuracy: 66.74 
*** Epoch 46 ***
loss: 0.004932092295850021 | train accuracy: 99.988 | test accuracy: 66.84 
*** Epoch 47 ***
loss: 0.004778390727203582 | train accuracy: 99.99 | test accuracy: 66.97 
*** Epoch 48 ***
loss: 0.004627896831845188 | train accuracy: 99.994 | test accuracy: 66.81 
*** Epoch 49 ***
loss: 0.004470903200323206 | train accuracy: 99.994 | test accuracy: 66.85 
CPU times: total: 1h 14min 23s
Wall time: 40min 58s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 0.086779 97.412 67.50
1 1 0.074946 97.858 67.24
2 2 0.066098 98.230 67.00
3 3 0.059117 98.532 67.17
4 4 0.053371 98.718 66.91
5 5 0.048427 98.912 66.99
6 6 0.044259 99.044 67.03
7 7 0.040530 99.194 67.09
8 8 0.037181 99.296 66.91
9 9 0.034303 99.408 67.00
10 10 0.031635 99.476 66.85
11 11 0.029228 99.558 67.04
12 12 0.027148 99.612 66.84
13 13 0.025215 99.674 66.82
14 14 0.023483 99.702 66.93
15 15 0.021937 99.748 66.89
16 16 0.020415 99.792 66.86
17 17 0.019122 99.810 66.78
18 18 0.017966 99.848 66.91
19 19 0.016800 99.862 66.92
20 20 0.015842 99.874 66.81
21 21 0.014905 99.880 66.85
22 22 0.014058 99.894 66.89
23 23 0.013293 99.910 66.86
24 24 0.012604 99.914 66.91
25 25 0.011936 99.918 66.80
26 26 0.011326 99.934 66.85
27 27 0.010780 99.932 66.70
28 28 0.010256 99.946 66.75
29 29 0.009751 99.946 66.80
30 30 0.009307 99.950 66.88
31 31 0.008870 99.962 66.87
32 32 0.008488 99.958 66.86
33 33 0.008121 99.966 66.95
34 34 0.007779 99.968 66.87
35 35 0.007467 99.972 66.70
36 36 0.007145 99.974 66.83
37 37 0.006868 99.974 66.82
38 38 0.006600 99.976 66.82
39 39 0.006356 99.978 66.81
40 40 0.006113 99.982 66.84
41 41 0.005877 99.984 66.89
42 42 0.005665 99.984 66.76
43 43 0.005468 99.984 66.83
44 44 0.005286 99.984 66.94
45 45 0.005111 99.986 66.74
46 46 0.004932 99.988 66.84
47 47 0.004778 99.990 66.97
48 48 0.004628 99.994 66.81
49 49 0.004471 99.994 66.85
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/5-10a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/5-10b.png')
Out[ ]:

Batch size: 100¶

In [ ]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=100,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=100,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(100)))
Files already downloaded and verified
Files already downloaded and verified
 ship truck  deer plane  frog   dog horse horse   car  bird   dog horse  ship  deer   car  bird   dog  ship truck   cat   car   car   car  ship  ship  deer  frog   dog  ship   dog   car   cat  bird  deer horse   car truck   dog plane  frog  deer plane horse   dog horse  bird truck horse   car horse  deer   car  frog  bird  frog horse truck  frog   dog horse  deer  deer  ship  deer   car   car  frog   dog  frog   car   dog  bird  bird plane   cat  ship  frog  frog truck  ship   car truck  ship   car   car  bird   dog   car   car plane   car  bird  bird   car  deer truck  frog  bird  frog   cat
2022-12-01T18:34:38.236936 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 0.004217426573021528 | train accuracy: 99.994 | test accuracy: 66.84 
*** Epoch 1 ***
loss: 0.004183346701912836 | train accuracy: 99.994 | test accuracy: 66.88 
*** Epoch 2 ***
loss: 0.004166601598776041 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 3 ***
loss: 0.0041522467938179 | train accuracy: 99.994 | test accuracy: 66.88 
*** Epoch 4 ***
loss: 0.004139079320158638 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 5 ***
loss: 0.004125757714246356 | train accuracy: 99.994 | test accuracy: 66.88 
*** Epoch 6 ***
loss: 0.004113289125390605 | train accuracy: 99.994 | test accuracy: 66.88 
*** Epoch 7 ***
loss: 0.004100587565679559 | train accuracy: 99.994 | test accuracy: 66.9 
*** Epoch 8 ***
loss: 0.004088359416223585 | train accuracy: 99.994 | test accuracy: 66.9 
*** Epoch 9 ***
loss: 0.00407578945505192 | train accuracy: 99.994 | test accuracy: 66.9 
*** Epoch 10 ***
loss: 0.004064789598990598 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 11 ***
loss: 0.004052899435408906 | train accuracy: 99.994 | test accuracy: 66.9 
*** Epoch 12 ***
loss: 0.004040958351905426 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 13 ***
loss: 0.0040287479353725225 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 14 ***
loss: 0.004016337832967883 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 15 ***
loss: 0.004005093125856009 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 16 ***
loss: 0.003992846616140692 | train accuracy: 99.994 | test accuracy: 66.88 
*** Epoch 17 ***
loss: 0.003981510618186457 | train accuracy: 99.994 | test accuracy: 66.86 
*** Epoch 18 ***
loss: 0.003970838622665985 | train accuracy: 99.994 | test accuracy: 66.88 
*** Epoch 19 ***
loss: 0.003958750986503456 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 20 ***
loss: 0.003947201893918724 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 21 ***
loss: 0.003936428884426897 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 22 ***
loss: 0.00392468928621675 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 23 ***
loss: 0.003913501795485825 | train accuracy: 99.994 | test accuracy: 66.88 
*** Epoch 24 ***
loss: 0.003902188795273433 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 25 ***
loss: 0.003891484769208652 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 26 ***
loss: 0.003880393562580875 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 27 ***
loss: 0.0038696437342025207 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 28 ***
loss: 0.0038584368354743553 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 29 ***
loss: 0.0038474390144648406 | train accuracy: 99.994 | test accuracy: 66.88 
*** Epoch 30 ***
loss: 0.003837098329251479 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 31 ***
loss: 0.003826374102328794 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 32 ***
loss: 0.00381512552563228 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 33 ***
loss: 0.003805126336250148 | train accuracy: 99.994 | test accuracy: 66.95 
*** Epoch 34 ***
loss: 0.0037948153206130457 | train accuracy: 99.994 | test accuracy: 66.9 
*** Epoch 35 ***
loss: 0.003784263476669669 | train accuracy: 99.994 | test accuracy: 66.85 
*** Epoch 36 ***
loss: 0.0037725391624614836 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 37 ***
loss: 0.0037627246782355353 | train accuracy: 99.994 | test accuracy: 66.88 
*** Epoch 38 ***
loss: 0.0037525039654269904 | train accuracy: 99.994 | test accuracy: 66.9 
*** Epoch 39 ***
loss: 0.0037416294695369018 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 40 ***
loss: 0.0037321744590273543 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 41 ***
loss: 0.0037218219687404457 | train accuracy: 99.994 | test accuracy: 66.9 
*** Epoch 42 ***
loss: 0.0037107723986707585 | train accuracy: 99.994 | test accuracy: 66.86 
*** Epoch 43 ***
loss: 0.00370158749107152 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 44 ***
loss: 0.0036916484758127218 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 45 ***
loss: 0.0036808705855093764 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 46 ***
loss: 0.003670884357188606 | train accuracy: 99.994 | test accuracy: 66.9 
*** Epoch 47 ***
loss: 0.003661387741849394 | train accuracy: 99.994 | test accuracy: 66.9 
*** Epoch 48 ***
loss: 0.0036509871979305644 | train accuracy: 99.994 | test accuracy: 66.87 
*** Epoch 49 ***
loss: 0.0036415445727047946 | train accuracy: 99.994 | test accuracy: 66.9 
CPU times: total: 1h 7min 49s
Wall time: 25min 37s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 0.004217 99.994 66.84
1 1 0.004183 99.994 66.88
2 2 0.004167 99.994 66.87
3 3 0.004152 99.994 66.88
4 4 0.004139 99.994 66.89
5 5 0.004126 99.994 66.88
6 6 0.004113 99.994 66.88
7 7 0.004101 99.994 66.90
8 8 0.004088 99.994 66.90
9 9 0.004076 99.994 66.90
10 10 0.004065 99.994 66.89
11 11 0.004053 99.994 66.90
12 12 0.004041 99.994 66.89
13 13 0.004029 99.994 66.89
14 14 0.004016 99.994 66.87
15 15 0.004005 99.994 66.87
16 16 0.003993 99.994 66.88
17 17 0.003982 99.994 66.86
18 18 0.003971 99.994 66.88
19 19 0.003959 99.994 66.87
20 20 0.003947 99.994 66.89
21 21 0.003936 99.994 66.87
22 22 0.003925 99.994 66.89
23 23 0.003914 99.994 66.88
24 24 0.003902 99.994 66.87
25 25 0.003891 99.994 66.89
26 26 0.003880 99.994 66.89
27 27 0.003870 99.994 66.87
28 28 0.003858 99.994 66.89
29 29 0.003847 99.994 66.88
30 30 0.003837 99.994 66.89
31 31 0.003826 99.994 66.87
32 32 0.003815 99.994 66.91
33 33 0.003805 99.994 66.95
34 34 0.003795 99.994 66.90
35 35 0.003784 99.994 66.85
36 36 0.003773 99.994 66.91
37 37 0.003763 99.994 66.88
38 38 0.003753 99.994 66.90
39 39 0.003742 99.994 66.87
40 40 0.003732 99.994 66.87
41 41 0.003722 99.994 66.90
42 42 0.003711 99.994 66.86
43 43 0.003702 99.994 66.89
44 44 0.003692 99.994 66.87
45 45 0.003681 99.994 66.87
46 46 0.003671 99.994 66.90
47 47 0.003661 99.994 66.90
48 48 0.003651 99.994 66.87
49 49 0.003642 99.994 66.90
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/5-100a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/5-100b.png')
Out[ ]:

Batch size: 1000¶

In [ ]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=1000,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=1000,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')



# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(1000)))
Files already downloaded and verified
Files already downloaded and verified
plane   cat truck   cat plane   car  bird  bird   cat   car   dog  frog horse   car  frog horse plane truck plane truck plane   cat   dog truck horse  bird plane  deer plane   dog horse  bird   car  frog horse  frog   dog horse  bird   car  deer truck  frog horse   dog  ship truck  ship   dog truck plane   cat horse  frog plane  deer   car horse  frog   cat   cat  frog plane  deer   dog plane  deer  ship  deer  frog horse  deer   cat truck  deer truck plane plane   dog   car horse  frog truck  bird  deer horse  deer plane   dog plane   cat   dog plane truck plane   car horse horse truck truck truck horse   dog  deer plane plane plane   car horse  ship horse  deer   dog  ship  frog  frog horse   car plane   cat   dog  frog  frog   car   car   dog  frog horse  bird  ship  bird plane  ship   dog  deer  frog horse  bird  frog horse  ship  deer   cat   dog  ship   car  frog  deer   car horse   dog horse horse   car  bird  bird   cat horse horse  ship   cat   dog plane   dog horse  ship  deer   cat  ship   car  frog   dog truck plane horse truck   cat plane  ship   car  ship  deer  ship   dog truck truck  frog horse   cat plane horse  deer horse   car truck   dog  frog   cat truck plane  frog   cat truck   car   car  bird   cat truck  deer  bird  ship  bird plane  bird   dog  ship   dog   cat   car   cat  frog  frog   car  ship  frog plane  ship   cat   car  deer   car  bird  ship horse  deer truck   dog  frog  ship  frog plane  deer  bird  frog   dog plane  frog  ship   dog  frog truck   car   car  bird  deer   car   car   cat  deer  ship plane plane   car truck   dog  ship   dog   dog truck  frog  ship plane  bird truck truck plane   car truck  deer  frog   cat   car  ship  frog plane  deer plane   dog  bird  bird horse  frog   cat horse  ship truck   car   dog plane  frog  bird   dog  frog plane   cat truck   car  deer   car   car   car   dog  deer   car  frog  ship   car   dog   cat  bird plane   cat horse  bird truck   car   dog plane horse  bird  frog  frog   cat   cat plane   dog  ship plane   cat   cat   cat  bird horse  bird truck plane  bird  bird  frog   cat horse truck horse   dog   car truck   car   car  ship   dog  ship  frog   cat plane   dog  frog horse   car  bird truck  bird  ship   dog  frog  frog   car plane   cat   car   cat   car   cat  bird  frog horse  ship horse plane  ship  deer   cat  frog  deer  ship   cat  deer horse horse  deer  ship   car horse plane  ship horse truck  frog   car plane plane   car horse   dog horse   cat   car  ship truck truck  deer plane   dog  bird horse   dog  ship   car  deer   cat   dog horse plane horse   cat truck plane plane  frog   cat plane   cat horse horse  ship plane  frog  ship  frog truck horse   dog  deer   car   car  deer plane   dog horse   cat  ship  ship  ship truck   cat   car  ship   dog   cat  deer   cat  ship  ship   cat  deer  frog   car horse truck   cat plane plane  ship  bird   car plane  frog   cat plane   dog   car truck   dog  bird plane plane  frog truck  bird  frog   car   dog plane   car   car plane   cat plane   cat   car  deer  frog   dog  deer horse  frog  bird   car plane horse   cat horse   cat  ship truck   dog   dog   cat   car horse   dog horse truck  deer plane   dog  ship   car  frog horse  frog  deer  ship  bird   car truck plane truck horse truck  frog  deer plane   cat  bird horse   cat truck  deer plane   dog   dog   car   dog  bird  bird   cat   dog   cat  ship  bird  frog  frog horse plane  bird   cat  ship  bird   cat   car  deer  deer  ship  ship   dog truck  deer   dog plane  bird  ship   cat  deer truck   car  deer horse  bird truck horse  ship truck   dog  frog   cat horse truck plane   cat plane   dog   car   car   car   cat truck  ship   dog  ship plane  frog  deer   car plane   dog   dog   cat  bird horse  bird truck horse   dog  deer   dog truck  ship plane   dog truck  ship   dog  bird  deer truck  frog   dog   cat   cat truck horse  deer  frog   dog   car truck plane  deer  ship truck horse horse plane   cat   cat  bird  bird   dog  deer   car truck  frog  bird  ship   car truck   dog plane   car horse horse  ship plane   dog  bird truck   car plane  ship  deer horse  frog  frog horse  bird  ship   car truck   dog   car truck truck horse truck  frog   dog plane truck  deer plane  ship  ship   car  deer  ship  deer   cat  deer  deer horse  ship  deer  ship plane  bird   car truck  frog   cat   cat   car  ship   cat   dog   car  deer horse plane  deer  deer   cat  deer  deer horse truck   cat  bird   dog   car plane   dog   car   dog  bird  frog   dog  bird   cat truck   cat   dog  frog plane   car   cat horse   car plane  bird   cat  bird plane   car horse  ship truck  bird   dog   cat   car  ship   dog truck  frog truck truck  frog   dog truck horse horse  frog horse   car truck plane horse   cat plane truck plane   car  bird horse   dog  bird   dog   cat horse  deer  frog   cat truck   car  frog   cat plane plane   dog   cat   cat truck  ship horse   car  frog horse plane   dog  bird  frog   cat   dog plane  deer   car  ship  ship   cat   cat horse  deer  bird plane   cat  deer   car   dog  deer   car  frog horse   dog  ship plane  ship  ship  deer  bird  deer  frog   car  frog  ship  frog   car   cat  frog  frog   dog   dog   cat  deer  frog  deer  deer truck   cat  deer truck  ship  frog  ship   dog  bird  deer  frog   cat  ship  frog horse   cat  ship  bird truck   car  deer horse   dog   car   dog  bird truck  frog   cat horse   cat   car truck horse plane   dog  bird  bird   car truck   car truck  ship   car truck   cat  ship horse   car   cat plane  deer   dog plane  bird   cat   cat truck  ship truck   cat  ship plane truck  ship   car plane   car  deer truck plane   dog   car  frog   cat  frog  frog plane  deer   cat   cat   dog   car   dog truck   car   car truck  bird   dog  ship   car   cat  frog  deer horse truck  deer  ship  bird  frog  frog  frog truck  ship truck   dog truck  deer   dog   dog  deer truck  deer  bird  deer horse horse  bird truck  deer  bird  ship truck
2022-12-01T18:35:03.035453 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 0.0036858302227468515 | train accuracy: 99.99400000000001 | test accuracy: 66.9 
*** Epoch 1 ***
loss: 0.0036842829253220435 | train accuracy: 99.99400000000001 | test accuracy: 66.9 
*** Epoch 2 ***
loss: 0.0036829121135251255 | train accuracy: 99.99400000000001 | test accuracy: 66.89 
*** Epoch 3 ***
loss: 0.003681747784495962 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 4 ***
loss: 0.0036806825367847874 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 5 ***
loss: 0.0036796916618335005 | train accuracy: 99.99400000000001 | test accuracy: 66.89 
*** Epoch 6 ***
loss: 0.0036785499743965206 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 7 ***
loss: 0.003677544602173932 | train accuracy: 99.99399999999997 | test accuracy: 66.89 
*** Epoch 8 ***
loss: 0.0036766199651649414 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 9 ***
loss: 0.003675467529505187 | train accuracy: 99.99400000000001 | test accuracy: 66.89 
*** Epoch 10 ***
loss: 0.003674506333333497 | train accuracy: 99.99400000000001 | test accuracy: 66.89 
*** Epoch 11 ***
loss: 0.003673524800117831 | train accuracy: 99.99400000000001 | test accuracy: 66.89 
*** Epoch 12 ***
loss: 0.0036724066048176313 | train accuracy: 99.99400000000001 | test accuracy: 66.89 
*** Epoch 13 ***
loss: 0.0036715363171331734 | train accuracy: 99.99400000000001 | test accuracy: 66.89 
*** Epoch 14 ***
loss: 0.0036705349695545677 | train accuracy: 99.99400000000001 | test accuracy: 66.89 
*** Epoch 15 ***
loss: 0.003669484318899257 | train accuracy: 99.994 | test accuracy: 66.89999999999999 
*** Epoch 16 ***
loss: 0.003668519421195497 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 17 ***
loss: 0.003667410491604586 | train accuracy: 99.99400000000001 | test accuracy: 66.89999999999999 
*** Epoch 18 ***
loss: 0.003666548367247594 | train accuracy: 99.99400000000001 | test accuracy: 66.89 
*** Epoch 19 ***
loss: 0.0036656232693289617 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 20 ***
loss: 0.0036646320285009487 | train accuracy: 99.99400000000001 | test accuracy: 66.89999999999999 
*** Epoch 21 ***
loss: 0.0036637636699846815 | train accuracy: 99.99400000000001 | test accuracy: 66.89999999999999 
*** Epoch 22 ***
loss: 0.0036627442661520777 | train accuracy: 99.994 | test accuracy: 66.89999999999999 
*** Epoch 23 ***
loss: 0.003661679949763478 | train accuracy: 99.99400000000001 | test accuracy: 66.89999999999999 
*** Epoch 24 ***
loss: 0.0036607488980326727 | train accuracy: 99.99400000000001 | test accuracy: 66.89 
*** Epoch 25 ***
loss: 0.0036596745556714584 | train accuracy: 99.994 | test accuracy: 66.89 
*** Epoch 26 ***
loss: 0.0036586617518748555 | train accuracy: 99.99400000000001 | test accuracy: 66.89999999999999 
*** Epoch 27 ***
loss: 0.0036577742157161844 | train accuracy: 99.99399999999997 | test accuracy: 66.89999999999999 
*** Epoch 28 ***
loss: 0.003656752963493369 | train accuracy: 99.99400000000001 | test accuracy: 66.89999999999999 
*** Epoch 29 ***
loss: 0.003655819668985751 | train accuracy: 99.99400000000001 | test accuracy: 66.89999999999999 
*** Epoch 30 ***
loss: 0.003654726519610505 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 31 ***
loss: 0.003653772650476621 | train accuracy: 99.99400000000001 | test accuracy: 66.91 
*** Epoch 32 ***
loss: 0.0036528238703553775 | train accuracy: 99.994 | test accuracy: 66.89999999999999 
*** Epoch 33 ***
loss: 0.0036518308096470274 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 34 ***
loss: 0.003650979642585224 | train accuracy: 99.99400000000001 | test accuracy: 66.91 
*** Epoch 35 ***
loss: 0.0036498451966564265 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 36 ***
loss: 0.0036488420101909004 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 37 ***
loss: 0.003647864572893904 | train accuracy: 99.99400000000001 | test accuracy: 66.91 
*** Epoch 38 ***
loss: 0.0036469922324984657 | train accuracy: 99.99400000000001 | test accuracy: 66.91 
*** Epoch 39 ***
loss: 0.003646078486261623 | train accuracy: 99.99400000000001 | test accuracy: 66.91 
*** Epoch 40 ***
loss: 0.0036450135379041335 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 41 ***
loss: 0.0036441352959646254 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 42 ***
loss: 0.0036431175694629854 | train accuracy: 99.99400000000001 | test accuracy: 66.91 
*** Epoch 43 ***
loss: 0.0036421076023989184 | train accuracy: 99.99399999999997 | test accuracy: 66.91 
*** Epoch 44 ***
loss: 0.0036411139334799076 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 45 ***
loss: 0.0036402127981109886 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 46 ***
loss: 0.003639196497103085 | train accuracy: 99.99400000000001 | test accuracy: 66.91 
*** Epoch 47 ***
loss: 0.0036382011556047567 | train accuracy: 99.994 | test accuracy: 66.91 
*** Epoch 48 ***
loss: 0.0036372172323112587 | train accuracy: 99.99400000000001 | test accuracy: 66.91 
*** Epoch 49 ***
loss: 0.0036363103617058725 | train accuracy: 99.994 | test accuracy: 66.91 
CPU times: total: 1h 19min 23s
Wall time: 24min 24s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 0.003686 99.994 66.90
1 1 0.003684 99.994 66.90
2 2 0.003683 99.994 66.89
3 3 0.003682 99.994 66.89
4 4 0.003681 99.994 66.89
5 5 0.003680 99.994 66.89
6 6 0.003679 99.994 66.89
7 7 0.003678 99.994 66.89
8 8 0.003677 99.994 66.89
9 9 0.003675 99.994 66.89
10 10 0.003675 99.994 66.89
11 11 0.003674 99.994 66.89
12 12 0.003672 99.994 66.89
13 13 0.003672 99.994 66.89
14 14 0.003671 99.994 66.89
15 15 0.003669 99.994 66.90
16 16 0.003669 99.994 66.89
17 17 0.003667 99.994 66.90
18 18 0.003667 99.994 66.89
19 19 0.003666 99.994 66.89
20 20 0.003665 99.994 66.90
21 21 0.003664 99.994 66.90
22 22 0.003663 99.994 66.90
23 23 0.003662 99.994 66.90
24 24 0.003661 99.994 66.89
25 25 0.003660 99.994 66.89
26 26 0.003659 99.994 66.90
27 27 0.003658 99.994 66.90
28 28 0.003657 99.994 66.90
29 29 0.003656 99.994 66.90
30 30 0.003655 99.994 66.91
31 31 0.003654 99.994 66.91
32 32 0.003653 99.994 66.90
33 33 0.003652 99.994 66.91
34 34 0.003651 99.994 66.91
35 35 0.003650 99.994 66.91
36 36 0.003649 99.994 66.91
37 37 0.003648 99.994 66.91
38 38 0.003647 99.994 66.91
39 39 0.003646 99.994 66.91
40 40 0.003645 99.994 66.91
41 41 0.003644 99.994 66.91
42 42 0.003643 99.994 66.91
43 43 0.003642 99.994 66.91
44 44 0.003641 99.994 66.91
45 45 0.003640 99.994 66.91
46 46 0.003639 99.994 66.91
47 47 0.003638 99.994 66.91
48 48 0.003637 99.994 66.91
49 49 0.003636 99.994 66.91
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/5-1000a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/5-1000b.png')
Out[ ]:

(6)¶

Try multiple learning rates to see the effect and describe the findings. Please use learning rates of 10, 0.1, 0.01, and 0.001.

Answer:

Overall, it appears that learning rates 10, 0.1, 0.01, and 0.001 are all too large for this model and data. They all yield very low test and train accuracies (~10%). The runtimes are also not significantly different. Hence, it is best to choose a much smaller learning for this model and dataset.

Learning rate = 10¶

Observation(s): Here, a very large learning rate causes large updates that causes large loss values and possibly infinite values or NaNs. The large learning rate is causing the gradient to explode. However, the very large or infinite values could also be the result of large learning rate and deep network combined. Using learning rate 10, the network becomes unstable so the loss is larger than the ones observed so far and the accuracies are much lower.

In [ ]:
# using batch size = 4 
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')



# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# # show images
imshow(torchvision.utils.make_grid(images))
# # print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
Files already downloaded and verified
Files already downloaded and verified
plane   car horse plane
2022-12-01T19:07:58.383954 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
# use the same model as part 4 
# change lr 
optimizer = optim.SGD(MultiNet.parameters(), lr=10, momentum=0.5)
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        #outputs = torch.nan_to_num(outputs)
        #check in outputs are Nan
        check = int((outputs != outputs).sum())
        if(check>0):
            print("data contains Nan")
        #else:
            #print("data does not contain Nan")
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        nn.utils.clip_grad_norm_(MultiNet.parameters(), 1)
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 5.1880405516646 | train accuracy: 9.926 | test accuracy: 10.0 
*** Epoch 1 ***
loss: 5.146548132747066 | train accuracy: 10.056 | test accuracy: 10.0 
*** Epoch 2 ***
loss: 5.196976502854039 | train accuracy: 9.856 | test accuracy: 10.0 
*** Epoch 3 ***
loss: 5.177698532425972 | train accuracy: 10.088 | test accuracy: 10.0 
*** Epoch 4 ***
loss: 5.1488397529062455 | train accuracy: 10.196 | test accuracy: 10.0 
*** Epoch 5 ***
loss: 5.16865594680905 | train accuracy: 10.092 | test accuracy: 10.0 
*** Epoch 6 ***
loss: 5.186575542579547 | train accuracy: 9.76 | test accuracy: 10.0 
*** Epoch 7 ***
loss: 5.1528045862855905 | train accuracy: 10.106 | test accuracy: 10.0 
*** Epoch 8 ***
loss: 5.182451105075505 | train accuracy: 9.964 | test accuracy: 10.0 
*** Epoch 9 ***
loss: 5.166777372942017 | train accuracy: 10.098 | test accuracy: 10.0 
*** Epoch 10 ***
loss: 5.162722996067301 | train accuracy: 10.174 | test accuracy: 10.0 
*** Epoch 11 ***
loss: 5.188010190651792 | train accuracy: 9.892 | test accuracy: 10.0 
*** Epoch 12 ***
loss: 5.145834563240851 | train accuracy: 10.042 | test accuracy: 10.0 
*** Epoch 13 ***
loss: 5.1802641142129 | train accuracy: 10.056 | test accuracy: 10.0 
*** Epoch 14 ***
loss: 5.188762573788038 | train accuracy: 9.992 | test accuracy: 10.0 
*** Epoch 15 ***
loss: 5.178696287509031 | train accuracy: 9.9 | test accuracy: 10.0 
*** Epoch 16 ***
loss: 5.161578253608998 | train accuracy: 10.056 | test accuracy: 10.0 
*** Epoch 17 ***
loss: 5.207314058390796 | train accuracy: 9.974 | test accuracy: 10.0 
*** Epoch 18 ***
loss: 5.140690191100122 | train accuracy: 10.032 | test accuracy: 10.0 
*** Epoch 19 ***
loss: 5.134989613591873 | train accuracy: 9.98 | test accuracy: 10.0 
*** Epoch 20 ***
loss: 5.133775610345889 | train accuracy: 10.018 | test accuracy: 10.0 
*** Epoch 21 ***
loss: 5.1836204086268305 | train accuracy: 9.952 | test accuracy: 10.0 
*** Epoch 22 ***
loss: 5.201752294527644 | train accuracy: 10.06 | test accuracy: 10.0 
*** Epoch 23 ***
loss: 5.154112419945515 | train accuracy: 10.136 | test accuracy: 10.0 
*** Epoch 24 ***
loss: 5.1601564445937385 | train accuracy: 10.236 | test accuracy: 10.0 
*** Epoch 25 ***
loss: 5.18268043982481 | train accuracy: 9.912 | test accuracy: 10.0 
*** Epoch 26 ***
loss: 5.151116521618483 | train accuracy: 9.872 | test accuracy: 10.0 
*** Epoch 27 ***
loss: 5.152553780351756 | train accuracy: 10.182 | test accuracy: 10.0 
*** Epoch 28 ***
loss: 5.15181517335632 | train accuracy: 10.148 | test accuracy: 10.0 
*** Epoch 29 ***
loss: 5.173313327766111 | train accuracy: 10.06 | test accuracy: 10.0 
*** Epoch 30 ***
loss: 5.193359491042838 | train accuracy: 9.664 | test accuracy: 10.0 
*** Epoch 31 ***
loss: 5.187044921111075 | train accuracy: 9.86 | test accuracy: 10.0 
*** Epoch 32 ***
loss: 5.1746605793441125 | train accuracy: 9.922 | test accuracy: 10.0 
*** Epoch 33 ***
loss: 5.18415225523788 | train accuracy: 10.052 | test accuracy: 10.0 
*** Epoch 34 ***
loss: 5.1695518381530015 | train accuracy: 10.086 | test accuracy: 10.0 
*** Epoch 35 ***
loss: 5.189045017731648 | train accuracy: 10.088 | test accuracy: 10.0 
*** Epoch 36 ***
loss: 5.179254347562352 | train accuracy: 10.182 | test accuracy: 10.0 
*** Epoch 37 ***
loss: 5.1745685550240195 | train accuracy: 10.018 | test accuracy: 10.0 
*** Epoch 38 ***
loss: 5.147179800356872 | train accuracy: 10.374 | test accuracy: 10.0 
*** Epoch 39 ***
loss: 5.151303629482619 | train accuracy: 10.02 | test accuracy: 10.0 
*** Epoch 40 ***
loss: 5.178501741606118 | train accuracy: 9.716 | test accuracy: 10.0 
*** Epoch 41 ***
loss: 5.1630637068130065 | train accuracy: 9.982 | test accuracy: 10.0 
*** Epoch 42 ***
loss: 5.157953915312374 | train accuracy: 10.108 | test accuracy: 10.0 
*** Epoch 43 ***
loss: 5.154209447337185 | train accuracy: 9.866 | test accuracy: 10.0 
*** Epoch 44 ***
loss: 5.193471591539651 | train accuracy: 9.532 | test accuracy: 10.0 
*** Epoch 45 ***
loss: 5.159397295341391 | train accuracy: 10.158 | test accuracy: 10.0 
*** Epoch 46 ***
loss: 5.14016592797303 | train accuracy: 10.094 | test accuracy: 10.0 
*** Epoch 47 ***
loss: 5.179231274248609 | train accuracy: 9.876 | test accuracy: 10.0 
*** Epoch 48 ***
loss: 5.095895581268694 | train accuracy: 10.07 | test accuracy: 10.0 
*** Epoch 49 ***
loss: 5.14559622577824 | train accuracy: 10.03 | test accuracy: 10.0 
CPU times: total: 2h 28min 8s
Wall time: 49min 58s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 5.188041 9.926 10.0
1 1 5.146548 10.056 10.0
2 2 5.196977 9.856 10.0
3 3 5.177699 10.088 10.0
4 4 5.148840 10.196 10.0
5 5 5.168656 10.092 10.0
6 6 5.186576 9.760 10.0
7 7 5.152805 10.106 10.0
8 8 5.182451 9.964 10.0
9 9 5.166777 10.098 10.0
10 10 5.162723 10.174 10.0
11 11 5.188010 9.892 10.0
12 12 5.145835 10.042 10.0
13 13 5.180264 10.056 10.0
14 14 5.188763 9.992 10.0
15 15 5.178696 9.900 10.0
16 16 5.161578 10.056 10.0
17 17 5.207314 9.974 10.0
18 18 5.140690 10.032 10.0
19 19 5.134990 9.980 10.0
20 20 5.133776 10.018 10.0
21 21 5.183620 9.952 10.0
22 22 5.201752 10.060 10.0
23 23 5.154112 10.136 10.0
24 24 5.160156 10.236 10.0
25 25 5.182680 9.912 10.0
26 26 5.151117 9.872 10.0
27 27 5.152554 10.182 10.0
28 28 5.151815 10.148 10.0
29 29 5.173313 10.060 10.0
30 30 5.193359 9.664 10.0
31 31 5.187045 9.860 10.0
32 32 5.174661 9.922 10.0
33 33 5.184152 10.052 10.0
34 34 5.169552 10.086 10.0
35 35 5.189045 10.088 10.0
36 36 5.179254 10.182 10.0
37 37 5.174569 10.018 10.0
38 38 5.147180 10.374 10.0
39 39 5.151304 10.020 10.0
40 40 5.178502 9.716 10.0
41 41 5.163064 9.982 10.0
42 42 5.157954 10.108 10.0
43 43 5.154209 9.866 10.0
44 44 5.193472 9.532 10.0
45 45 5.159397 10.158 10.0
46 46 5.140166 10.094 10.0
47 47 5.179231 9.876 10.0
48 48 5.095896 10.070 10.0
49 49 5.145596 10.030 10.0
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()

Learning rate = 0.1¶

In [ ]:
# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
 ship   cat   cat  deer
2022-11-25T22:18:37.953741 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
optimizer = optim.SGD(MultiNet.parameters(), lr=0.1, momentum=0.9)
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        #print(data)
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 2.3609209210026294 | train accuracy: 9.994 | test accuracy: 10.0 
*** Epoch 1 ***
loss: 2.358136475338956 | train accuracy: 10.07 | test accuracy: 10.0 
*** Epoch 2 ***
loss: 2.3609946251621303 | train accuracy: 9.702 | test accuracy: 10.0 
*** Epoch 3 ***
loss: 2.3591029049062815 | train accuracy: 9.946 | test accuracy: 10.0 
*** Epoch 4 ***
loss: 2.3594188737682327 | train accuracy: 9.992 | test accuracy: 10.0 
*** Epoch 5 ***
loss: 2.360642668685681 | train accuracy: 9.8 | test accuracy: 10.0 
*** Epoch 6 ***
loss: 2.35801522317281 | train accuracy: 10.192 | test accuracy: 10.0 
*** Epoch 7 ***
loss: 2.3597723776573694 | train accuracy: 9.794 | test accuracy: 10.0 
*** Epoch 8 ***
loss: 2.360227543779903 | train accuracy: 10.12 | test accuracy: 10.0 
*** Epoch 9 ***
loss: 2.3596781425375357 | train accuracy: 10.038 | test accuracy: 10.0 
*** Epoch 10 ***
loss: 2.3585398669376003 | train accuracy: 9.952 | test accuracy: 10.0 
*** Epoch 11 ***
loss: 2.3586165334846663 | train accuracy: 10.034 | test accuracy: 10.0 
*** Epoch 12 ***
loss: 2.359845554135267 | train accuracy: 9.792 | test accuracy: 10.0 
*** Epoch 13 ***
loss: 2.359876299488114 | train accuracy: 10.004 | test accuracy: 10.0 
*** Epoch 14 ***
loss: 2.3602974320690255 | train accuracy: 9.844 | test accuracy: 10.0 
*** Epoch 15 ***
loss: 2.3573347850364343 | train accuracy: 10.268 | test accuracy: 10.0 
*** Epoch 16 ***
loss: 2.359820234707827 | train accuracy: 9.832 | test accuracy: 10.0 
*** Epoch 17 ***
loss: 2.3571287940736676 | train accuracy: 10.016 | test accuracy: 10.0 
*** Epoch 18 ***
loss: 2.3603160667346947 | train accuracy: 9.844 | test accuracy: 10.0 
*** Epoch 19 ***
loss: 2.360164494941364 | train accuracy: 10.152 | test accuracy: 10.0 
*** Epoch 20 ***
loss: 2.359950891618052 | train accuracy: 9.858 | test accuracy: 10.0 
*** Epoch 21 ***
loss: 2.360635100855752 | train accuracy: 9.828 | test accuracy: 10.0 
*** Epoch 22 ***
loss: 2.35877483615361 | train accuracy: 9.912 | test accuracy: 10.0 
*** Epoch 23 ***
loss: 2.359896816771663 | train accuracy: 9.934 | test accuracy: 10.0 
*** Epoch 24 ***
loss: 2.3596303400511514 | train accuracy: 10.3 | test accuracy: 10.0 
*** Epoch 25 ***
loss: 2.359467768861786 | train accuracy: 10.046 | test accuracy: 10.0 
*** Epoch 26 ***
loss: 2.359252470771316 | train accuracy: 10.172 | test accuracy: 10.0 
*** Epoch 27 ***
loss: 2.3594423730961465 | train accuracy: 9.904 | test accuracy: 10.0 
*** Epoch 28 ***
loss: 2.360807493282553 | train accuracy: 9.988 | test accuracy: 10.0 
*** Epoch 29 ***
loss: 2.358840221032609 | train accuracy: 9.994 | test accuracy: 10.0 
*** Epoch 30 ***
loss: 2.359223269779154 | train accuracy: 9.772 | test accuracy: 10.0 
*** Epoch 31 ***
loss: 2.360605035670157 | train accuracy: 9.886 | test accuracy: 10.0 
*** Epoch 32 ***
loss: 2.361357254155665 | train accuracy: 9.918 | test accuracy: 10.0 
*** Epoch 33 ***
loss: 2.359111694786413 | train accuracy: 10.096 | test accuracy: 10.0 
*** Epoch 34 ***
loss: 2.357495138784229 | train accuracy: 10.18 | test accuracy: 10.0 
*** Epoch 35 ***
loss: 2.3578532528558704 | train accuracy: 10.024 | test accuracy: 10.0 
*** Epoch 36 ***
loss: 2.35896719736198 | train accuracy: 9.906 | test accuracy: 10.0 
*** Epoch 37 ***
loss: 2.3601043400301704 | train accuracy: 9.862 | test accuracy: 10.0 
*** Epoch 38 ***
loss: 2.358733483462994 | train accuracy: 9.88 | test accuracy: 10.0 
*** Epoch 39 ***
loss: 2.3594886594528637 | train accuracy: 10.02 | test accuracy: 10.0 
*** Epoch 40 ***
loss: 2.3614949344091296 | train accuracy: 9.966 | test accuracy: 10.0 
*** Epoch 41 ***
loss: 2.3603575064932043 | train accuracy: 10.058 | test accuracy: 10.0 
*** Epoch 42 ***
loss: 2.360271472627234 | train accuracy: 9.85 | test accuracy: 10.0 
*** Epoch 43 ***
loss: 2.360144206577038 | train accuracy: 10.064 | test accuracy: 10.0 
*** Epoch 44 ***
loss: 2.358330350769492 | train accuracy: 10.032 | test accuracy: 10.0 
*** Epoch 45 ***
loss: 2.3598030739206726 | train accuracy: 10.198 | test accuracy: 10.0 
*** Epoch 46 ***
loss: 2.3619818305080345 | train accuracy: 10.05 | test accuracy: 10.0 
*** Epoch 47 ***
loss: 2.360028318285284 | train accuracy: 9.736 | test accuracy: 10.0 
*** Epoch 48 ***
loss: 2.358494273900356 | train accuracy: 9.91 | test accuracy: 10.0 
*** Epoch 49 ***
loss: 2.359841944475309 | train accuracy: 9.812 | test accuracy: 10.0 
CPU times: total: 2h 40min 4s
Wall time: 1h 5min 29s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 2.360921 9.994 10.0
1 1 2.358136 10.070 10.0
2 2 2.360995 9.702 10.0
3 3 2.359103 9.946 10.0
4 4 2.359419 9.992 10.0
5 5 2.360643 9.800 10.0
6 6 2.358015 10.192 10.0
7 7 2.359772 9.794 10.0
8 8 2.360228 10.120 10.0
9 9 2.359678 10.038 10.0
10 10 2.358540 9.952 10.0
11 11 2.358617 10.034 10.0
12 12 2.359846 9.792 10.0
13 13 2.359876 10.004 10.0
14 14 2.360297 9.844 10.0
15 15 2.357335 10.268 10.0
16 16 2.359820 9.832 10.0
17 17 2.357129 10.016 10.0
18 18 2.360316 9.844 10.0
19 19 2.360164 10.152 10.0
20 20 2.359951 9.858 10.0
21 21 2.360635 9.828 10.0
22 22 2.358775 9.912 10.0
23 23 2.359897 9.934 10.0
24 24 2.359630 10.300 10.0
25 25 2.359468 10.046 10.0
26 26 2.359252 10.172 10.0
27 27 2.359442 9.904 10.0
28 28 2.360807 9.988 10.0
29 29 2.358840 9.994 10.0
30 30 2.359223 9.772 10.0
31 31 2.360605 9.886 10.0
32 32 2.361357 9.918 10.0
33 33 2.359112 10.096 10.0
34 34 2.357495 10.180 10.0
35 35 2.357853 10.024 10.0
36 36 2.358967 9.906 10.0
37 37 2.360104 9.862 10.0
38 38 2.358733 9.880 10.0
39 39 2.359489 10.020 10.0
40 40 2.361495 9.966 10.0
41 41 2.360358 10.058 10.0
42 42 2.360271 9.850 10.0
43 43 2.360144 10.064 10.0
44 44 2.358330 10.032 10.0
45 45 2.359803 10.198 10.0
46 46 2.361982 10.050 10.0
47 47 2.360028 9.736 10.0
48 48 2.358494 9.910 10.0
49 49 2.359842 9.812 10.0
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/6-0.1a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/6-0.1b.png')
Out[ ]:

learning rate = 0.01¶

In [ ]:
# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
truck truck truck plane
2022-11-25T22:22:09.160245 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
optimizer = optim.SGD(MultiNet.parameters(), lr=0.01, momentum=0.9) #change learning rate
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        #print(data)
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 2.3090189374802694 | train accuracy: 9.898 | test accuracy: 10.0 
*** Epoch 1 ***
loss: 2.3084589612513544 | train accuracy: 10.014 | test accuracy: 10.0 
*** Epoch 2 ***
loss: 2.3084058324778898 | train accuracy: 9.876 | test accuracy: 10.0 
*** Epoch 3 ***
loss: 2.308032003896066 | train accuracy: 10.12 | test accuracy: 10.0 
*** Epoch 4 ***
loss: 2.3085325088145416 | train accuracy: 9.926 | test accuracy: 10.0 
*** Epoch 5 ***
loss: 2.3084927449293904 | train accuracy: 10.034 | test accuracy: 10.0 
*** Epoch 6 ***
loss: 2.309020870751844 | train accuracy: 9.896 | test accuracy: 10.0 
*** Epoch 7 ***
loss: 2.308691943337492 | train accuracy: 9.708 | test accuracy: 10.0 
*** Epoch 8 ***
loss: 2.3084211099795393 | train accuracy: 9.862 | test accuracy: 10.0 
*** Epoch 9 ***
loss: 2.30835693499996 | train accuracy: 9.96 | test accuracy: 10.0 
*** Epoch 10 ***
loss: 2.308614597538775 | train accuracy: 9.892 | test accuracy: 10.0 
*** Epoch 11 ***
loss: 2.3084851223255636 | train accuracy: 9.978 | test accuracy: 10.0 
*** Epoch 12 ***
loss: 2.3086142038305204 | train accuracy: 9.682 | test accuracy: 10.0 
*** Epoch 13 ***
loss: 2.3083971644870873 | train accuracy: 9.78 | test accuracy: 10.0 
*** Epoch 14 ***
loss: 2.3085357450048334 | train accuracy: 10.0 | test accuracy: 10.0 
*** Epoch 15 ***
loss: 2.308572929604968 | train accuracy: 9.89 | test accuracy: 10.0 
*** Epoch 16 ***
loss: 2.3085994845018813 | train accuracy: 9.996 | test accuracy: 10.0 
*** Epoch 17 ***
loss: 2.308678487480445 | train accuracy: 9.722 | test accuracy: 10.0 
*** Epoch 18 ***
loss: 2.308368958285431 | train accuracy: 10.004 | test accuracy: 10.0 
*** Epoch 19 ***
loss: 2.3083167906064626 | train accuracy: 9.892 | test accuracy: 10.0 
*** Epoch 20 ***
loss: 2.3084066041002886 | train accuracy: 10.084 | test accuracy: 10.0 
*** Epoch 21 ***
loss: 2.308530491860888 | train accuracy: 9.786 | test accuracy: 10.0 
*** Epoch 22 ***
loss: 2.3082992895115506 | train accuracy: 10.056 | test accuracy: 10.0 
*** Epoch 23 ***
loss: 2.308354253720852 | train accuracy: 9.948 | test accuracy: 10.0 
*** Epoch 24 ***
loss: 2.3082512858123985 | train accuracy: 10.162 | test accuracy: 10.0 
*** Epoch 25 ***
loss: 2.308501202599107 | train accuracy: 9.924 | test accuracy: 10.0 
*** Epoch 26 ***
loss: 2.3082084412555313 | train accuracy: 10.082 | test accuracy: 10.0 
*** Epoch 27 ***
loss: 2.308726716970518 | train accuracy: 9.72 | test accuracy: 10.0 
*** Epoch 28 ***
loss: 2.308231241388868 | train accuracy: 10.06 | test accuracy: 10.0 
*** Epoch 29 ***
loss: 2.3085441546818193 | train accuracy: 10.218 | test accuracy: 10.0 
*** Epoch 30 ***
loss: 2.308279071772458 | train accuracy: 10.058 | test accuracy: 10.0 
*** Epoch 31 ***
loss: 2.308357590856109 | train accuracy: 9.854 | test accuracy: 10.0 
*** Epoch 32 ***
loss: 2.308495979593681 | train accuracy: 9.906 | test accuracy: 10.0 
*** Epoch 33 ***
loss: 2.308245714687998 | train accuracy: 9.916 | test accuracy: 10.0 
*** Epoch 34 ***
loss: 2.308402081414865 | train accuracy: 9.932 | test accuracy: 10.0 
*** Epoch 35 ***
loss: 2.308489034057417 | train accuracy: 10.064 | test accuracy: 10.0 
*** Epoch 36 ***
loss: 2.308178364615391 | train accuracy: 10.092 | test accuracy: 10.0 
*** Epoch 37 ***
loss: 2.3086451826233683 | train accuracy: 9.744 | test accuracy: 10.0 
*** Epoch 38 ***
loss: 2.308743696974243 | train accuracy: 9.838 | test accuracy: 10.0 
*** Epoch 39 ***
loss: 2.3082781109640873 | train accuracy: 10.09 | test accuracy: 10.0 
*** Epoch 40 ***
loss: 2.3083317652159914 | train accuracy: 9.914 | test accuracy: 10.0 
*** Epoch 41 ***
loss: 2.308845790827481 | train accuracy: 9.968 | test accuracy: 10.0 
*** Epoch 42 ***
loss: 2.3085894218606273 | train accuracy: 9.986 | test accuracy: 10.0 
*** Epoch 43 ***
loss: 2.3085321540765373 | train accuracy: 10.102 | test accuracy: 10.0 
*** Epoch 44 ***
loss: 2.3085674261205607 | train accuracy: 9.906 | test accuracy: 10.0 
*** Epoch 45 ***
loss: 2.3083536572642913 | train accuracy: 10.124 | test accuracy: 10.0 
*** Epoch 46 ***
loss: 2.3084600494808076 | train accuracy: 10.156 | test accuracy: 10.0 
*** Epoch 47 ***
loss: 2.3083949037357008 | train accuracy: 9.794 | test accuracy: 10.0 
*** Epoch 48 ***
loss: 2.3085798388033334 | train accuracy: 9.814 | test accuracy: 10.0 
*** Epoch 49 ***
loss: 2.308431481655068 | train accuracy: 9.848 | test accuracy: 10.0 
CPU times: total: 1h 35min 3s
Wall time: 42min 12s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 2.309019 9.898 10.0
1 1 2.308459 10.014 10.0
2 2 2.308406 9.876 10.0
3 3 2.308032 10.120 10.0
4 4 2.308533 9.926 10.0
5 5 2.308493 10.034 10.0
6 6 2.309021 9.896 10.0
7 7 2.308692 9.708 10.0
8 8 2.308421 9.862 10.0
9 9 2.308357 9.960 10.0
10 10 2.308615 9.892 10.0
11 11 2.308485 9.978 10.0
12 12 2.308614 9.682 10.0
13 13 2.308397 9.780 10.0
14 14 2.308536 10.000 10.0
15 15 2.308573 9.890 10.0
16 16 2.308599 9.996 10.0
17 17 2.308678 9.722 10.0
18 18 2.308369 10.004 10.0
19 19 2.308317 9.892 10.0
20 20 2.308407 10.084 10.0
21 21 2.308530 9.786 10.0
22 22 2.308299 10.056 10.0
23 23 2.308354 9.948 10.0
24 24 2.308251 10.162 10.0
25 25 2.308501 9.924 10.0
26 26 2.308208 10.082 10.0
27 27 2.308727 9.720 10.0
28 28 2.308231 10.060 10.0
29 29 2.308544 10.218 10.0
30 30 2.308279 10.058 10.0
31 31 2.308358 9.854 10.0
32 32 2.308496 9.906 10.0
33 33 2.308246 9.916 10.0
34 34 2.308402 9.932 10.0
35 35 2.308489 10.064 10.0
36 36 2.308178 10.092 10.0
37 37 2.308645 9.744 10.0
38 38 2.308744 9.838 10.0
39 39 2.308278 10.090 10.0
40 40 2.308332 9.914 10.0
41 41 2.308846 9.968 10.0
42 42 2.308589 9.986 10.0
43 43 2.308532 10.102 10.0
44 44 2.308567 9.906 10.0
45 45 2.308354 10.124 10.0
46 46 2.308460 10.156 10.0
47 47 2.308395 9.794 10.0
48 48 2.308580 9.814 10.0
49 49 2.308431 9.848 10.0
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/6-0.01a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/6-0.01b.png')
Out[ ]:

Learning rate = 0.001¶

In [ ]:
# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
horse horse horse  frog
2022-11-25T22:25:07.736731 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
optimizer = optim.SGD(MultiNet.parameters(), lr=0.001, momentum=0.9) #change learning rate
In [ ]:
%%time
for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
[1,  2000] loss: 2.015
[1,  4000] loss: 1.633
[1,  6000] loss: 1.476
[1,  8000] loss: 1.385
[1, 10000] loss: 1.305
[1, 12000] loss: 1.247
[2,  2000] loss: 1.164
[2,  4000] loss: 1.142
[2,  6000] loss: 1.101
[2,  8000] loss: 1.076
[2, 10000] loss: 1.056
[2, 12000] loss: 1.050
Finished Training
CPU times: total: 2min 21s
Wall time: 1min 38s
In [ ]:
dataiter = iter(testloader)
images, labels = dataiter.next()

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))


outputs = MultiNet(images)

_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = MultiNet(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
    
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = MultiNet(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
GroundTruth:    cat  ship  ship plane
Predicted:   frog   car   car plane
Accuracy of the network on the 10000 test images: 63 %
Accuracy of plane : 77 %
Accuracy of   car : 79 %
Accuracy of  bird : 47 %
Accuracy of   cat : 32 %
Accuracy of  deer : 62 %
Accuracy of   dog : 52 %
Accuracy of  frog : 75 %
Accuracy of horse : 70 %
Accuracy of  ship : 69 %
Accuracy of truck : 67 %
2022-11-25T22:27:09.040992 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        #print(data)
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 2.303632252139467 | train accuracy: 9.822 | test accuracy: 10.0 
*** Epoch 1 ***
loss: 2.303463336477013 | train accuracy: 9.976 | test accuracy: 10.0 
*** Epoch 2 ***
loss: 2.3034949693329594 | train accuracy: 9.912 | test accuracy: 10.0 
*** Epoch 3 ***
loss: 2.3034019351568267 | train accuracy: 10.152 | test accuracy: 10.0 
*** Epoch 4 ***
loss: 2.3034346294609085 | train accuracy: 9.774 | test accuracy: 10.0 
*** Epoch 5 ***
loss: 2.3034150030281233 | train accuracy: 9.834 | test accuracy: 10.0 
*** Epoch 6 ***
loss: 2.30351863388863 | train accuracy: 9.932 | test accuracy: 10.0 
*** Epoch 7 ***
loss: 2.3033991097660005 | train accuracy: 9.956 | test accuracy: 10.0 
*** Epoch 8 ***
loss: 2.3033856445240475 | train accuracy: 9.856 | test accuracy: 10.0 
*** Epoch 9 ***
loss: 2.3033937317570246 | train accuracy: 9.918 | test accuracy: 10.0 
*** Epoch 10 ***
loss: 2.3034664522467523 | train accuracy: 10.038 | test accuracy: 10.0 
*** Epoch 11 ***
loss: 2.3034094094715076 | train accuracy: 9.746 | test accuracy: 10.0 
*** Epoch 12 ***
loss: 2.3034548550589062 | train accuracy: 9.816 | test accuracy: 10.0 
*** Epoch 13 ***
loss: 2.303431481320874 | train accuracy: 9.652 | test accuracy: 10.0 
*** Epoch 14 ***
loss: 2.3034265486562107 | train accuracy: 10.058 | test accuracy: 10.0 
*** Epoch 15 ***
loss: 2.303305099502412 | train accuracy: 10.008 | test accuracy: 10.0 
*** Epoch 16 ***
loss: 2.3033588113036285 | train accuracy: 9.984 | test accuracy: 10.0 
*** Epoch 17 ***
loss: 2.303458835079304 | train accuracy: 9.732 | test accuracy: 10.0 
*** Epoch 18 ***
loss: 2.3033278969652473 | train accuracy: 9.99 | test accuracy: 10.0 
*** Epoch 19 ***
loss: 2.3034386644364164 | train accuracy: 9.796 | test accuracy: 10.0 
*** Epoch 20 ***
loss: 2.3033552423688217 | train accuracy: 10.046 | test accuracy: 10.0 
*** Epoch 21 ***
loss: 2.3034744294359832 | train accuracy: 9.86 | test accuracy: 10.0 
*** Epoch 22 ***
loss: 2.3034555171144344 | train accuracy: 9.796 | test accuracy: 10.0 
*** Epoch 23 ***
loss: 2.3033726277356337 | train accuracy: 9.754 | test accuracy: 10.0 
*** Epoch 24 ***
loss: 2.303384363293619 | train accuracy: 9.878 | test accuracy: 10.0 
*** Epoch 25 ***
loss: 2.303367863198129 | train accuracy: 10.108 | test accuracy: 10.0 
*** Epoch 26 ***
loss: 2.3032890009359317 | train accuracy: 10.134 | test accuracy: 10.0 
*** Epoch 27 ***
loss: 2.3034799280944123 | train accuracy: 9.78 | test accuracy: 10.0 
*** Epoch 28 ***
loss: 2.3033633981955433 | train accuracy: 10.092 | test accuracy: 10.0 
*** Epoch 29 ***
loss: 2.303489934674205 | train accuracy: 9.904 | test accuracy: 10.0 
*** Epoch 30 ***
loss: 2.3034078548007777 | train accuracy: 10.01 | test accuracy: 10.0 
*** Epoch 31 ***
loss: 2.3033824466745303 | train accuracy: 9.964 | test accuracy: 10.0 
*** Epoch 32 ***
loss: 2.3033586426614447 | train accuracy: 10.09 | test accuracy: 10.0 
*** Epoch 33 ***
loss: 2.303454263390174 | train accuracy: 9.878 | test accuracy: 10.0 
*** Epoch 34 ***
loss: 2.3033504042399198 | train accuracy: 9.952 | test accuracy: 10.0 
*** Epoch 35 ***
loss: 2.303502630754436 | train accuracy: 9.856 | test accuracy: 10.0 
*** Epoch 36 ***
loss: 2.3035274198965334 | train accuracy: 9.804 | test accuracy: 10.0 
*** Epoch 37 ***
loss: 2.3034799500306766 | train accuracy: 9.622 | test accuracy: 10.0 
*** Epoch 38 ***
loss: 2.3033509020786664 | train accuracy: 9.922 | test accuracy: 10.0 
*** Epoch 39 ***
loss: 2.303332983698785 | train accuracy: 10.048 | test accuracy: 10.0 
*** Epoch 40 ***
loss: 2.3034161023401585 | train accuracy: 9.938 | test accuracy: 10.0 
*** Epoch 41 ***
loss: 2.3034647427632224 | train accuracy: 10.212 | test accuracy: 10.0 
*** Epoch 42 ***
loss: 2.3033343288305046 | train accuracy: 10.13 | test accuracy: 10.0 
*** Epoch 43 ***
loss: 2.3034422182256638 | train accuracy: 9.878 | test accuracy: 10.0 
*** Epoch 44 ***
loss: 2.303477764644664 | train accuracy: 9.854 | test accuracy: 10.0 
*** Epoch 45 ***
loss: 2.303334215391406 | train accuracy: 10.044 | test accuracy: 10.0 
*** Epoch 46 ***
loss: 2.303436020620633 | train accuracy: 9.952 | test accuracy: 10.0 
*** Epoch 47 ***
loss: 2.3034371033374077 | train accuracy: 9.856 | test accuracy: 10.0 
*** Epoch 48 ***
loss: 2.3034528063262822 | train accuracy: 10.038 | test accuracy: 10.0 
*** Epoch 49 ***
loss: 2.3034134849623418 | train accuracy: 10.086 | test accuracy: 10.0 
CPU times: total: 1h 40min 34s
Wall time: 42min 55s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 2.303632 9.822 10.0
1 1 2.303463 9.976 10.0
2 2 2.303495 9.912 10.0
3 3 2.303402 10.152 10.0
4 4 2.303435 9.774 10.0
5 5 2.303415 9.834 10.0
6 6 2.303519 9.932 10.0
7 7 2.303399 9.956 10.0
8 8 2.303386 9.856 10.0
9 9 2.303394 9.918 10.0
10 10 2.303466 10.038 10.0
11 11 2.303409 9.746 10.0
12 12 2.303455 9.816 10.0
13 13 2.303431 9.652 10.0
14 14 2.303427 10.058 10.0
15 15 2.303305 10.008 10.0
16 16 2.303359 9.984 10.0
17 17 2.303459 9.732 10.0
18 18 2.303328 9.990 10.0
19 19 2.303439 9.796 10.0
20 20 2.303355 10.046 10.0
21 21 2.303474 9.860 10.0
22 22 2.303456 9.796 10.0
23 23 2.303373 9.754 10.0
24 24 2.303384 9.878 10.0
25 25 2.303368 10.108 10.0
26 26 2.303289 10.134 10.0
27 27 2.303480 9.780 10.0
28 28 2.303363 10.092 10.0
29 29 2.303490 9.904 10.0
30 30 2.303408 10.010 10.0
31 31 2.303382 9.964 10.0
32 32 2.303359 10.090 10.0
33 33 2.303454 9.878 10.0
34 34 2.303350 9.952 10.0
35 35 2.303503 9.856 10.0
36 36 2.303527 9.804 10.0
37 37 2.303480 9.622 10.0
38 38 2.303351 9.922 10.0
39 39 2.303333 10.048 10.0
40 40 2.303416 9.938 10.0
41 41 2.303465 10.212 10.0
42 42 2.303334 10.130 10.0
43 43 2.303442 9.878 10.0
44 44 2.303478 9.854 10.0
45 45 2.303334 10.044 10.0
46 46 2.303436 9.952 10.0
47 47 2.303437 9.856 10.0
48 48 2.303453 10.038 10.0
49 49 2.303413 10.086 10.0
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/6-0.001a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/6-0.001b.png')
Out[ ]:

(7)¶

Please add some data augmentation to avoid overfitting. Note that you need to do this only for the trainnig and not the testing. You may use line 208 from Imagenet sample code: https://github.com/pytorch/examples/blob/master/imagenet/main.py ”RandomResizedCrop” samples a random patch from the image to train the model on. ”RandomHorizontalFlip” flips randomly chosen images horizontally.

Answer:

Using data augmentation, the model seems to perfrom well. The accuracy plot shows that the training and test accuracies are somewhat closer compare to the accuracy plots shown in the previous sections. The plot also also shows that the loss smoothly decreases to a small value after multiple epochs.

In [ ]:
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
])

test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=train_transform)

trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=test_transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
Files already downloaded and verified
Files already downloaded and verified
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
 bird  deer  bird  frog
2022-12-01T16:20:07.751961 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
optimizer = optim.SGD(MultiNet.parameters(), lr=0.0002, momentum=0.9) #use the same learning rate as part 4 
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        #print(data)
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 1.8086859489290301 | train accuracy: 33.416 | test accuracy: 39.96 
*** Epoch 1 ***
loss: 1.6852646669260702 | train accuracy: 37.898 | test accuracy: 41.18 
*** Epoch 2 ***
loss: 1.6212948598960122 | train accuracy: 40.464 | test accuracy: 44.69 
*** Epoch 3 ***
loss: 1.543987396402601 | train accuracy: 43.186 | test accuracy: 48.8 
*** Epoch 4 ***
loss: 1.4645929705625496 | train accuracy: 46.532 | test accuracy: 51.17 
*** Epoch 5 ***
loss: 1.4004847430803269 | train accuracy: 49.39 | test accuracy: 52.56 
*** Epoch 6 ***
loss: 1.3558163988927867 | train accuracy: 51.168 | test accuracy: 55.21 
*** Epoch 7 ***
loss: 1.3229486953521674 | train accuracy: 52.366 | test accuracy: 56.5 
*** Epoch 8 ***
loss: 1.2946175028758438 | train accuracy: 53.42 | test accuracy: 57.2 
*** Epoch 9 ***
loss: 1.266718249699766 | train accuracy: 54.482 | test accuracy: 58.9 
*** Epoch 10 ***
loss: 1.2446251323667037 | train accuracy: 55.392 | test accuracy: 58.92 
*** Epoch 11 ***
loss: 1.2192121288507487 | train accuracy: 56.414 | test accuracy: 59.68 
*** Epoch 12 ***
loss: 1.199471732714036 | train accuracy: 57.06 | test accuracy: 60.81 
*** Epoch 13 ***
loss: 1.1788874973812498 | train accuracy: 57.852 | test accuracy: 61.38 
*** Epoch 14 ***
loss: 1.1656151195675557 | train accuracy: 58.306 | test accuracy: 61.17 
*** Epoch 15 ***
loss: 1.146689116318473 | train accuracy: 59.264 | test accuracy: 62.48 
*** Epoch 16 ***
loss: 1.14040867660735 | train accuracy: 59.506 | test accuracy: 62.32 
*** Epoch 17 ***
loss: 1.1283945909741748 | train accuracy: 59.74 | test accuracy: 62.52 
*** Epoch 18 ***
loss: 1.1165165899988914 | train accuracy: 60.442 | test accuracy: 61.33 
*** Epoch 19 ***
loss: 1.1123623348354068 | train accuracy: 60.664 | test accuracy: 64.09 
*** Epoch 20 ***
loss: 1.1005332604764757 | train accuracy: 60.872 | test accuracy: 63.19 
*** Epoch 21 ***
loss: 1.088187068511605 | train accuracy: 61.172 | test accuracy: 63.57 
*** Epoch 22 ***
loss: 1.085720523512011 | train accuracy: 61.478 | test accuracy: 62.57 
*** Epoch 23 ***
loss: 1.075500167663823 | train accuracy: 61.758 | test accuracy: 64.2 
*** Epoch 24 ***
loss: 1.0688215364162654 | train accuracy: 61.916 | test accuracy: 64.86 
*** Epoch 25 ***
loss: 1.0628240575665358 | train accuracy: 62.23 | test accuracy: 64.79 
*** Epoch 26 ***
loss: 1.0616199600830922 | train accuracy: 62.364 | test accuracy: 65.51 
*** Epoch 27 ***
loss: 1.0567108106164063 | train accuracy: 62.512 | test accuracy: 65.5 
*** Epoch 28 ***
loss: 1.0503540233508406 | train accuracy: 62.764 | test accuracy: 64.83 
*** Epoch 29 ***
loss: 1.0485929869339186 | train accuracy: 62.6 | test accuracy: 65.29 
*** Epoch 30 ***
loss: 1.0418785761751808 | train accuracy: 63.12 | test accuracy: 65.84 
*** Epoch 31 ***
loss: 1.0452401510789497 | train accuracy: 63.11 | test accuracy: 66.18 
*** Epoch 32 ***
loss: 1.0346196626621287 | train accuracy: 63.444 | test accuracy: 65.73 
*** Epoch 33 ***
loss: 1.0326931505004355 | train accuracy: 63.516 | test accuracy: 64.6 
*** Epoch 34 ***
loss: 1.028465756907295 | train accuracy: 63.466 | test accuracy: 65.49 
*** Epoch 35 ***
loss: 1.0219423703153971 | train accuracy: 63.808 | test accuracy: 65.76 
*** Epoch 36 ***
loss: 1.02043124205076 | train accuracy: 63.646 | test accuracy: 66.2 
*** Epoch 37 ***
loss: 1.0215531678201801 | train accuracy: 63.922 | test accuracy: 66.47 
*** Epoch 38 ***
loss: 1.0165595432588594 | train accuracy: 63.82 | test accuracy: 66.5 
*** Epoch 39 ***
loss: 1.017402517922173 | train accuracy: 63.944 | test accuracy: 66.13 
*** Epoch 40 ***
loss: 1.0146785848729682 | train accuracy: 64.192 | test accuracy: 66.72 
*** Epoch 41 ***
loss: 1.0040433588455246 | train accuracy: 64.632 | test accuracy: 66.93 
*** Epoch 42 ***
loss: 1.008105358298734 | train accuracy: 64.214 | test accuracy: 66.82 
*** Epoch 43 ***
loss: 1.009657993014085 | train accuracy: 64.254 | test accuracy: 66.87 
*** Epoch 44 ***
loss: 1.0003421225562168 | train accuracy: 64.554 | test accuracy: 66.36 
*** Epoch 45 ***
loss: 1.0027526100115847 | train accuracy: 64.462 | test accuracy: 67.22 
*** Epoch 46 ***
loss: 0.9999213297556689 | train accuracy: 64.71 | test accuracy: 66.76 
*** Epoch 47 ***
loss: 0.9936627089307851 | train accuracy: 64.87 | test accuracy: 66.34 
*** Epoch 48 ***
loss: 0.9889910577927764 | train accuracy: 64.844 | test accuracy: 67.57 
*** Epoch 49 ***
loss: 0.9895313216693874 | train accuracy: 64.96 | test accuracy: 68.32 
CPU times: total: 1h 21min 12s
Wall time: 51min 41s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 1.808686 33.416 39.96
1 1 1.685265 37.898 41.18
2 2 1.621295 40.464 44.69
3 3 1.543987 43.186 48.80
4 4 1.464593 46.532 51.17
5 5 1.400485 49.390 52.56
6 6 1.355816 51.168 55.21
7 7 1.322949 52.366 56.50
8 8 1.294618 53.420 57.20
9 9 1.266718 54.482 58.90
10 10 1.244625 55.392 58.92
11 11 1.219212 56.414 59.68
12 12 1.199472 57.060 60.81
13 13 1.178887 57.852 61.38
14 14 1.165615 58.306 61.17
15 15 1.146689 59.264 62.48
16 16 1.140409 59.506 62.32
17 17 1.128395 59.740 62.52
18 18 1.116517 60.442 61.33
19 19 1.112362 60.664 64.09
20 20 1.100533 60.872 63.19
21 21 1.088187 61.172 63.57
22 22 1.085721 61.478 62.57
23 23 1.075500 61.758 64.20
24 24 1.068822 61.916 64.86
25 25 1.062824 62.230 64.79
26 26 1.061620 62.364 65.51
27 27 1.056711 62.512 65.50
28 28 1.050354 62.764 64.83
29 29 1.048593 62.600 65.29
30 30 1.041879 63.120 65.84
31 31 1.045240 63.110 66.18
32 32 1.034620 63.444 65.73
33 33 1.032693 63.516 64.60
34 34 1.028466 63.466 65.49
35 35 1.021942 63.808 65.76
36 36 1.020431 63.646 66.20
37 37 1.021553 63.922 66.47
38 38 1.016560 63.820 66.50
39 39 1.017403 63.944 66.13
40 40 1.014679 64.192 66.72
41 41 1.004043 64.632 66.93
42 42 1.008105 64.214 66.82
43 43 1.009658 64.254 66.87
44 44 1.000342 64.554 66.36
45 45 1.002753 64.462 67.22
46 46 0.999921 64.710 66.76
47 47 0.993663 64.870 66.34
48 48 0.988991 64.844 67.57
49 49 0.989531 64.960 68.32
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/7a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/7b.png')
Out[ ]:

(8)¶

Change the loss function from Cross Entropy to Mean Squared Error and report the effect.

Answer:

Switching from cross-entropy to mean-squared error very slightly reduces the overall accuracy since MSE does not perform well in classification generally. Multi-Classfication problems are not typically convex, continuous or differentiable so MSE is not very useful in this case. MSE is more useful in regression problems where the target to be predicted is numerical. When used in classification problems, MSE tends to emphasize incorrect outputs.The losses obtained also appears to be much smaller than ones obtained using cross entropy but we can't compare them using these numberical values since they are different. The time it takes to train and test using mse is not very different from cross entropy.

In [ ]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
Files already downloaded and verified
Files already downloaded and verified
 frog   cat   car  frog
2022-12-01T17:20:42.997218 image/svg+xml Matplotlib v3.5.1, https://matplotlib.org/
In [ ]:
criterion = nn.MSELoss()
optimizer = optim.SGD(MultiNet.parameters(), lr=0.0002, momentum=0.9)
In [ ]:
#one-hot encoder 
def Labelarrays(labels):
    m = np.zeros((labels.shape[0], 10))
    for i in range(len(labels)):
        m[i][labels[i]] = 1

    return torch.FloatTensor(m)
In [ ]:
#revised accuracy func
def accuracy(pred, labels):
    _, predicted = torch.max(pred, dim=1)
    _, labels = torch.max(labels, dim=1)
    correct_pred = torch.sum(predicted==labels).item()
    total_pred = len(predicted)
    accuracy = 100*(correct_pred/total_pred)
    return accuracy
In [ ]:
%%time
#train and test model 
train_loss = [] #training loss 
train_acc = [] #training accuracy
test_acc = [] #testing accuracy 

for epoch in range(50): 
    print("*** Epoch {} ***".format(epoch)) #print current epoch 
    
    #train
    total_loss  = 0.0
    train_accs = []
    for i, data in enumerate(trainloader, 0):
        #print(data)
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        labels = Labelarrays(labels)
        optimizer.zero_grad()
        outputs = MultiNet(inputs)
        loss = criterion(outputs, labels)
        total_loss += loss.item()
        tr_acc = accuracy(outputs,labels)
        loss.backward()
        optimizer.step()
        train_accs.append(tr_acc)
    #save train results 
    train_loss_per_epoch = total_loss/i
    train_loss.append(train_loss_per_epoch) 
    train_acc_per_epoch = sum(train_accs)/len(train_accs)
    train_acc.append(train_acc_per_epoch)
    
    #test 
    test_accs = []
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            labels = Labelarrays(labels)
            outputs = MultiNet(images)
            te_acc = accuracy(outputs,labels)
            test_accs.append(te_acc)

    #save test results 
    test_acc_per_epoch = sum(test_accs)/len(test_accs)
    test_acc.append(test_acc_per_epoch)

    print('loss: {} | train accuracy: {} | test accuracy: {} '.format(train_loss_per_epoch, train_acc_per_epoch, test_acc_per_epoch))
*** Epoch 0 ***
loss: 0.06802025815040286 | train accuracy: 52.654 | test accuracy: 52.4 
*** Epoch 1 ***
loss: 0.066779491464568 | train accuracy: 53.828 | test accuracy: 53.86 
*** Epoch 2 ***
loss: 0.06563827798965073 | train accuracy: 54.892 | test accuracy: 53.93 
*** Epoch 3 ***
loss: 0.06462351817237405 | train accuracy: 55.912 | test accuracy: 55.23 
*** Epoch 4 ***
loss: 0.06367916419620923 | train accuracy: 56.586 | test accuracy: 55.63 
*** Epoch 5 ***
loss: 0.06285115087698274 | train accuracy: 57.05 | test accuracy: 56.47 
*** Epoch 6 ***
loss: 0.06212191315299565 | train accuracy: 57.672 | test accuracy: 56.04 
*** Epoch 7 ***
loss: 0.061439274911665086 | train accuracy: 58.142 | test accuracy: 58.1 
*** Epoch 8 ***
loss: 0.06080624803473617 | train accuracy: 58.656 | test accuracy: 58.05 
*** Epoch 9 ***
loss: 0.06025298834328392 | train accuracy: 59.094 | test accuracy: 58.25 
*** Epoch 10 ***
loss: 0.059720169354719854 | train accuracy: 59.516 | test accuracy: 58.74 
*** Epoch 11 ***
loss: 0.059223800740676485 | train accuracy: 59.916 | test accuracy: 59.2 
*** Epoch 12 ***
loss: 0.05875307595781569 | train accuracy: 60.36 | test accuracy: 59.57 
*** Epoch 13 ***
loss: 0.05831072744942086 | train accuracy: 60.91 | test accuracy: 59.65 
*** Epoch 14 ***
loss: 0.05791828676058868 | train accuracy: 61.028 | test accuracy: 59.68 
*** Epoch 15 ***
loss: 0.05754823614977683 | train accuracy: 61.34 | test accuracy: 60.43 
*** Epoch 16 ***
loss: 0.05721634893155578 | train accuracy: 61.51 | test accuracy: 60.27 
*** Epoch 17 ***
loss: 0.056868336811788094 | train accuracy: 61.82 | test accuracy: 60.58 
*** Epoch 18 ***
loss: 0.05655949327658678 | train accuracy: 62.028 | test accuracy: 60.68 
*** Epoch 19 ***
loss: 0.05624923055663082 | train accuracy: 62.312 | test accuracy: 61.25 
*** Epoch 20 ***
loss: 0.055954940440305335 | train accuracy: 62.452 | test accuracy: 61.39 
*** Epoch 21 ***
loss: 0.055674265236750634 | train accuracy: 62.752 | test accuracy: 61.81 
*** Epoch 22 ***
loss: 0.05540584700016068 | train accuracy: 62.938 | test accuracy: 61.33 
*** Epoch 23 ***
loss: 0.055170781411861235 | train accuracy: 63.146 | test accuracy: 61.59 
*** Epoch 24 ***
loss: 0.05491785137779329 | train accuracy: 63.254 | test accuracy: 62.18 
*** Epoch 25 ***
loss: 0.05467793625257942 | train accuracy: 63.426 | test accuracy: 62.54 
*** Epoch 26 ***
loss: 0.054430398855861165 | train accuracy: 63.658 | test accuracy: 61.18 
*** Epoch 27 ***
loss: 0.05421820962233796 | train accuracy: 63.886 | test accuracy: 62.71 
*** Epoch 28 ***
loss: 0.05400333980993104 | train accuracy: 63.92 | test accuracy: 62.47 
*** Epoch 29 ***
loss: 0.05380644808394526 | train accuracy: 64.092 | test accuracy: 62.52 
*** Epoch 30 ***
loss: 0.05360067517364238 | train accuracy: 64.318 | test accuracy: 63.07 
*** Epoch 31 ***
loss: 0.05339412216746263 | train accuracy: 64.466 | test accuracy: 62.68 
*** Epoch 32 ***
loss: 0.05322335976044749 | train accuracy: 64.63 | test accuracy: 62.93 
*** Epoch 33 ***
loss: 0.05302784261688249 | train accuracy: 64.716 | test accuracy: 63.05 
*** Epoch 34 ***
loss: 0.05284703305458017 | train accuracy: 64.668 | test accuracy: 63.51 
*** Epoch 35 ***
loss: 0.052711097446301355 | train accuracy: 64.846 | test accuracy: 63.78 
*** Epoch 36 ***
loss: 0.05251772076262414 | train accuracy: 64.984 | test accuracy: 63.52 
*** Epoch 37 ***
loss: 0.052361183970286176 | train accuracy: 65.15 | test accuracy: 63.46 
*** Epoch 38 ***
loss: 0.052206025962795395 | train accuracy: 65.328 | test accuracy: 63.59 
*** Epoch 39 ***
loss: 0.05207929197970501 | train accuracy: 65.372 | test accuracy: 63.74 
*** Epoch 40 ***
loss: 0.05193330268857294 | train accuracy: 65.538 | test accuracy: 64.12 
*** Epoch 41 ***
loss: 0.05178926488567084 | train accuracy: 65.584 | test accuracy: 64.3 
*** Epoch 42 ***
loss: 0.05164872976320261 | train accuracy: 65.75 | test accuracy: 63.96 
*** Epoch 43 ***
loss: 0.05153816537009106 | train accuracy: 65.942 | test accuracy: 64.37 
*** Epoch 44 ***
loss: 0.051412489485646536 | train accuracy: 65.982 | test accuracy: 64.28 
*** Epoch 45 ***
loss: 0.05128572552926652 | train accuracy: 66.138 | test accuracy: 64.7 
*** Epoch 46 ***
loss: 0.051171268040625925 | train accuracy: 66.086 | test accuracy: 64.4 
*** Epoch 47 ***
loss: 0.051053592488655486 | train accuracy: 66.236 | test accuracy: 64.68 
*** Epoch 48 ***
loss: 0.050952112089751055 | train accuracy: 66.21 | test accuracy: 64.64 
*** Epoch 49 ***
loss: 0.05084814298419616 | train accuracy: 66.414 | test accuracy: 64.81 
CPU times: total: 2h 12min 25s
Wall time: 46min 29s
In [ ]:
df1 = pd.DataFrame.from_dict(train_loss)
df2 = pd.DataFrame.from_dict(train_acc) #, columns=['Train accuracy'])
df3 = pd.DataFrame.from_dict(test_acc) #, columns=['Test accuracy'])
dfs = [df1, df2, df3]
df = pd.concat(dfs, axis=1)
df.columns = ['Train Loss', 'Train accuracy', 'Test accuracy']
df = df.rename_axis('epochs').reset_index()
df
Out[ ]:
epochs Train Loss Train accuracy Test accuracy
0 0 0.068020 52.654 52.40
1 1 0.066779 53.828 53.86
2 2 0.065638 54.892 53.93
3 3 0.064624 55.912 55.23
4 4 0.063679 56.586 55.63
5 5 0.062851 57.050 56.47
6 6 0.062122 57.672 56.04
7 7 0.061439 58.142 58.10
8 8 0.060806 58.656 58.05
9 9 0.060253 59.094 58.25
10 10 0.059720 59.516 58.74
11 11 0.059224 59.916 59.20
12 12 0.058753 60.360 59.57
13 13 0.058311 60.910 59.65
14 14 0.057918 61.028 59.68
15 15 0.057548 61.340 60.43
16 16 0.057216 61.510 60.27
17 17 0.056868 61.820 60.58
18 18 0.056559 62.028 60.68
19 19 0.056249 62.312 61.25
20 20 0.055955 62.452 61.39
21 21 0.055674 62.752 61.81
22 22 0.055406 62.938 61.33
23 23 0.055171 63.146 61.59
24 24 0.054918 63.254 62.18
25 25 0.054678 63.426 62.54
26 26 0.054430 63.658 61.18
27 27 0.054218 63.886 62.71
28 28 0.054003 63.920 62.47
29 29 0.053806 64.092 62.52
30 30 0.053601 64.318 63.07
31 31 0.053394 64.466 62.68
32 32 0.053223 64.630 62.93
33 33 0.053028 64.716 63.05
34 34 0.052847 64.668 63.51
35 35 0.052711 64.846 63.78
36 36 0.052518 64.984 63.52
37 37 0.052361 65.150 63.46
38 38 0.052206 65.328 63.59
39 39 0.052079 65.372 63.74
40 40 0.051933 65.538 64.12
41 41 0.051789 65.584 64.30
42 42 0.051649 65.750 63.96
43 43 0.051538 65.942 64.37
44 44 0.051412 65.982 64.28
45 45 0.051286 66.138 64.70
46 46 0.051171 66.086 64.40
47 47 0.051054 66.236 64.68
48 48 0.050952 66.210 64.64
49 49 0.050848 66.414 64.81
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.85), title = 'Loss for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Loss')
subop = {'Train Loss': df[ 'Train Loss']}
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/8a.png')
Out[ ]:
In [ ]:
fig = px.line()
fig.update_layout(template = 'plotly_dark',legend=dict(title = '', 
    yanchor="top",
    y=0.25,
    xanchor="left",
    x=0.95), title = 'Accuracies for every epoch')
fig.update_xaxes(title_text='Epochs')
fig.update_yaxes(title_text='Accuracy')
subop = {#'Train Loss': df[ 'Train Loss'],
         'Train accuracy': df[ 'Train accuracy'],
         'Test accuracy': df[ 'Test accuracy'] }
for k, v in subop.items():
    fig.add_scatter(x=v.index, y = v, name = k )
fig.show()
In [ ]:
# Since I forgot to add `pio.renderers.default='notebook'` before plotting majority of my results, 
# I have to save most of the plotly plots I generated and redisplay their non-interactive image version 
# to avoid rerunning all the training and testing codes and save time 

Image(filename='plots/8b.png')
Out[ ]: